MUC1-DERIVED PROTEINS FOR THE DIAGNOSIS, IMAGING, AND THERAPY OF HUMAN
CANCER



TECHNICAL FIELD
The present invention relates to a newly-discovered
group of protein products of the MUC1 gene and diagnostic
and therapeutic methods for utilizing the same, as well as
diagnostic and therapeutic compositions containing the
same.

BACKGROUND OF THE INVENTION

Polymorphic, high molecular weight glycoproteins are
abundantly expressed in human breast carcinomas. These
proteins, designated MUC1 (also referred to as episialin,
H23Ag, PEM, EMA, CA15-3, MCA, etc.), are heavily glycosylated
with O-glycosidic-linked carbohydrate side chains and, as
such, have mucin-like characteristics [for review, see J.
Hilkens, et al., "Cell Membrane-Associated Mucins and Their
Adhesion Modulating Property," TIBS, Vol. 17, pp. 359-363
(1992)]. Although MUC1 proteins are expressed at basal
levels by most secretory epithelial tissues, their
expression is dramatically increased in malignant breast
epithelial cells [P.X. Xing, et al., "Reactivity of
Anti-Human Milk Fat Globule Antibodies with Synthetic
Peptides," J. Immunol., Vol. 142, pp. 3503-3509 (1989)]. The
fact that disease status in breast cancer patients is
routinely assessed by monitoring the serum levels of
circulating tandem repeat array-containing MUC1 protein,
using commercial assays such as CA15-3 and MCA (mammary
carcinoma antigen), underscores the unequivocal importance of
MUC1 gene expression to human breast cancer. That increased
MUC1 expression may reflect a change in the differentiation
status of the malignant epithelial cells is indicated by
high levels of MUC1 expression also in lactating mammary
epithelial tissue, where it is localized at the apical
surfaces. Due to the loss of cellular architecture in
breast cancer tissue, MUC1 is no longer expressed solely on
the apical surface and this, in conjunction with the finding
that MUC1 expression reduces cell-cell adhesion [M.J.L.
Ligtenberg, et al., "Suppression of Cellular Aggregation by
High Levels of Episialin," Cancer Res., Vol. 52,
pp. 2318-2324 (1992)], may enhance the invasiveness of the
breast cancer cell.

Molecular studies, including cDNA and gene cloning,
have elucidated many properties of the MUC1 proteins
[D.H. Wreschner, et al., "Isolation and Characterization of
Full Length cDNA Coding for the H23 Breast Tumor Associated
Antigen," in Breast Cancer: Progress in Biology, Clinical
Management and Prevention, M.A. Rich, J.C. Hager and I.
Keydar, Eds., Kluwer Academic Publishers, Boston, Mass.,
U.S.A., pp. 41-59 (1989); D.H. Wreschner, et al., "Human
Epithelial Tumor Antigen cDNA Sequences - Differential
Splicing May Generate Multiple Protein Forms," Eur. J.
Biochem., Vol. 189, pp. 463-473 (1990)]. The MUC1 gene
product best characterized so far is a polymorphic, type 1
transmembrane molecule that consists of a large
extracellular domain, a transmembrane domain and a 69 amino
acid cytoplasmic tail. The genetic polymorphism derives from
a 20 amino acid repeat motif rich in serine, threonine and
proline residues, that varies in number from approximately
20 to 100 repeats. The feature of a tandemly repeating
domain is shared by all cloned human, porcine and Xenopus
mucins (MUC2, MUC3, human tracheobronchial mucin MUC4, MUC5,
porcine submaxillary mucin and Xenopus integumentary mucin).
This common property notwithstanding, several unique
features distinguish the MUC1 proteins from the other
mucins. First, whereas the latter mucins have several
cysteine residues in their extracellular domains that form
disulfide bridges, thereby generating a mucin network, the
MUC1 proteins have no cysteine residues in their
extracellular domain, and thus are less likely to have this
mesh-forming capability. Second, and perhaps most
significantly, the MUC1 protein is a type 1 transmembrane
protein, a molecular structure not shared by the other mucin
molecules, which are secreted from the cell.

Insights into the function of MUC1 gene products have
been furnished by analyzing the phenotype of tandem repeat
array-containing transmembrane MUC1 transfectants. This has
shown that MUC1 expression reduces cellular adhesion
[Ligtenberg, et al., Cancer Res., ibid. (1992)].
Interestingly, a comparison of the human MUC1 amino acid
sequence with the mouse MUC1 homologue [A.P. Spicer, et al.,
"Molecular Cloning of the Mouse Homologue of the Tumor
Associated Mucin, MUC1, Reveals Conservation of Potential
O-Glycosylation Sites, Transmembrane and Cytoplasmic Domains
and a Loss of Minisatellite-Like Polymorphism," J. Biol.
Chem., Vol. 266, pp. 15099-15109 (1991)] shows that whereas
a tandem repeat structure rich in serine and threonine
residues is also observed in the mouse protein, there is
very little conservation of actual amino acid sequence in
this region. This indicates that perhaps the primary
function of mucin tandemly repeated domains is to provide
the "infrastructure" for extensive O-linked glycosylation,
thereby conferring to the molecule its anti-adhesion
function. Recent experiments have indeed shown that the
tandem repeat array mediates this anti-adhesive feature of
MUC1 protein.

As described above, expression of the polymorphic MUC1
proteins reduces cellular aggregation potential, suggesting
that MUC1 interference with cellular interactions may be
critical in tissue morphogenesis such as ductal development
by glandular epithelial cells in normal tissues [J. Hilkens,
et al., ibid. (1992)], and could be responsible for the
detachment of tumor cells from malignant tissues where it is
expressed at high levels [Ligtenberg, et al., Cancer Res.,
ibid. (1992)].

Comparison of MUC1 sequences in different species may
provide additional insights into functionally important
regions of MUC1 gene products. For example, the mouse MUC1
homologue shows, in contrast to the lack of similarity
within the tandem repeating sequence, a very high degree of
amino acid sequence conservation with human MUC1, in the
cytoplasmic and transmembrane domains as well as in the 120
amino acids N-terminal to the transmembrane domain. This
degree of amino acid sequence similarity is almost 90% in
the cytoplasmic and transmembrane domains, indicating that
these regions, as well as the 120 amino acids N-terminally
adjacent to the transmembrane domain, may be functionally
very important. This contrasts with the lack of
inter-species conservation of the MUC1 tandem repeat array
amino acid sequence, thereby suggesting that distinct
functions may be performed by the tandem repeat array and by
the other highly-conserved regions of the MUC1 proteins.

SUMMARY OF THE INVENTION
According to the present invention, there has now been
identified and characterized a group of novel protein
products of the MUC1 gene.



WO 96/03502    PCT/IB95/00627



More particularly, the present invention relates to
novel proteins designated herein as MUC1/X, MUC1/X/alt,
MUC1/Y, MUC1/Y/alt, MUC1/V, MUC1/V/alt, MUC1/W, MUC1/W/alt,
MUC1/Z and MUC1/Z/alt, which function as receptor proteins
and activating ligands for said receptors in human breast
cancer cells, and which proteins are all characterized by
the absence of the characteristic MUC1 protein tandem repeat
array.

Thus, according to the present invention, there is now
provided a biochemically pure MUC1 protein, selected from
the group consisting of MUC1/X, MUC1/X/alt, MUC1/Y,
MUC1/Y/alt, MUC1/V, MUC1/V/alt, MUC1/W, MUC1/W/alt, MUC1/Z,
and MUC1/Z/alt, or a functional derivative thereof, devoid
of a tandem repeat array.

The term "functional derivative" as used herein is 
intended to include labelled proteins, conjugated proteins, 
fused chimeric proteins and purified receptors in soluble 
form, as well as fragments, deletions, and conservative 
substitutions of said proteins. 

As will be realized, the biochemically pure MUC1
proteins as defined and claimed herein are isolated and
purified and are thus substantially free of natural
contaminants.

The term "conservative substitutions" as used herein is
intended to denote substitutions which preserve the activity
of the defined proteins, involving between 80% and 90%
conservation.
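
The specification does not state how the percentage of conservation is to be computed. One possible reading, offered here only as an assumption, is percent identity over aligned positions of equal-length sequences. The following minimal sketch illustrates that reading; the function name `percent_identity` and the single-residue variant in the usage note are hypothetical and are not taken from the specification.

```python
def percent_identity(reference: str, variant: str) -> float:
    """Percent of aligned positions at which two equal-length sequences agree.

    Assumption: "conservation" is read as simple percent identity with no
    gaps; the specification itself does not define the calculation.
    """
    if len(reference) != len(variant):
        raise ValueError("sequences must be pre-aligned to the same length")
    if not reference:
        raise ValueError("sequences must not be empty")
    matches = sum(a == b for a, b in zip(reference, variant))
    return 100.0 * matches / len(reference)
```

For example, comparing the ten-residue stretch `VVTGSGHASS` with a hypothetical variant `VVTGSGHAST` that differs at one position gives 90.0 under this reading.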

More specifically, the present invention provides a
biochemically pure MUC1 protein selected from the group
consisting of MUC1/X, MUC1/X/alt, MUC1/Y, MUC1/Y/alt,
MUC1/W, MUC1/W/alt, MUC1/Z, and MUC1/Z/alt, or a functional
derivative thereof, comprising a partial amino acid
sequence:

MTPGTQSPFFLLLLLTVLT [ATTAPKPAT]
VVTGSGHASSTPGGEKETSATQRSSVP
SSTEKNA

and devoid of a tandem repeat array downstream thereof.

Especially, the present invention provides a
biochemically pure MUC1 protein selected from the group
consisting of MUC1/X, MUC1/X/alt, MUC1/Y, MUC1/Y/alt,
MUC1/W, MUC1/W/alt, MUC1/Z, and MUC1/Z/alt, or a functional
derivative thereof, having a partial amino acid sequence:

MTPGTQSPFFLLLLLTVLT [ATTAPKPAT]
VVTGSGHASSTPGGEKETSATQRSSVP
SSTEKNA

and devoid of a tandem repeat array downstream thereof.

Furthermore, the present invention provides a
biochemically pure MUC1 protein selected from the group
consisting of MUC1/V, MUC1/V/alt, or a functional derivative
thereof, comprising a partial amino acid sequence:

MTPGTQSPFFLLLLLTVLT [ATTAPKPAT] 
VVTGSGHASSTPGGEKETSATQRSSVPS 

and devoid of a tandem repeat array downstream thereof. 

Still furthermore, the present invention provides a
biochemically pure MUC1 protein selected from the group
consisting of MUC1/V, MUC1/V/alt, or a functional derivative
thereof, having a partial amino acid sequence:

MTPGTQSPFFLLLLLTVLT [ATTAPKPAT] 
VVTGSGHASSTPGGEKETSATQRSSVPS 

and devoid of a tandem repeat array downstream thereof. 

The sequence starts at the amino (NH2) terminal
methionine (M) residue. The 9 amino acid sequence presented
in brackets [ATTAPKPAT] represents an isoform that
is generated by an alternative splice acceptor site.
Hereinafter, MUC1 derivatives containing this additional 9
amino acid sequence will be referred to as the "/alt
configuration" of the novel MUC1 derivatives described
herein. The two arrows indicate the sites at which cleavage
of the signal sequence is expected to occur (Fig. 2).
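
By way of illustration only, the relationship between each partial sequence and its /alt configuration can be sketched as plain string assembly. The residues below are copied from the partial sequences given above; the constant and function names (`LEADING_RESIDUES`, `ALT_INSERT`, `partial_sequence`, and so on) are hypothetical labels introduced for this sketch.

```python
# Partial N-terminal sequences from the text, in one-letter amino acid code.
LEADING_RESIDUES = "MTPGTQSPFFLLLLLTVLT"   # residues preceding the bracket
ALT_INSERT = "ATTAPKPAT"                   # the 9 bracketed residues
TAIL_X_Y_W_Z = "VVTGSGHASSTPGGEKETSATQRSSVPSSTEKNA"  # MUC1/X, /Y, /W, /Z
TAIL_V = "VVTGSGHASSTPGGEKETSATQRSSVPS"              # MUC1/V


def partial_sequence(tail: str, alt: bool) -> str:
    """Assemble a partial sequence, adding the 9-residue insert when alt is True."""
    return LEADING_RESIDUES + (ALT_INSERT if alt else "") + tail


muc1_v = partial_sequence(TAIL_V, alt=False)
muc1_v_alt = partial_sequence(TAIL_V, alt=True)
```

In this sketch the /alt form differs from the base form only by the nine inserted residues, so `len(muc1_v_alt) - len(muc1_v)` equals 9.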

Specifically, the present invention provides
biochemically pure MUC1/X and MUC1/X/alt, respectively
comprising the sequences shown in Figs. 5A and 5B and
functional derivatives thereof; biochemically pure MUC1/Y
and MUC1/Y/alt respectively comprising the sequences shown
in Figs. 6A and 6B and functional derivatives thereof;
biochemically pure MUC1/V and MUC1/V/alt, respectively
comprising the sequences shown in Figs. 6C and 6D and
functional derivatives thereof; MUC1/W and MUC1/W/alt
respectively comprising the sequences shown in Figs. 7A and
7B and functional derivatives thereof; and biochemically
pure MUC1/Z and MUC1/Z/alt respectively comprising the
sequences shown in Figs. 8A and 8B and functional
derivatives thereof.

More particularly, the present invention provides
biochemically pure MUC1/X and MUC1/X/alt, respectively
having the sequences shown in Figs. 5A and 5B and
functional derivatives thereof; biochemically pure MUC1/Y
and MUC1/Y/alt respectively having the sequences shown
in Figs. 6A and 6B and functional derivatives thereof;
biochemically pure MUC1/V and MUC1/V/alt, respectively
comprising the sequences shown in Figs. 6C and 6D and
functional derivatives thereof;
biochemically pure MUC1/W and MUC1/W/alt respectively having
the sequences shown in Figs. 7A and 7B and functional
derivatives thereof; and biochemically pure MUC1/Z and
MUC1/Z/alt respectively having the sequences shown in
Figs. 8A and 8B and functional derivatives thereof.

MUC1/X and MUC1/Y have been found to be generated by a
splicing mechanism, using perfect splice donor and splice
acceptor sites, located upstream and downstream to the
tandem repeat array of MUC1 while maintaining the original
reading frame, and therefore these proteins retain the
cytoplasmic and transmembrane domains, as well as the amino
acids immediately N-terminal to the transmembrane domain
(Figs. 1A and 1B, Fig. 2, Fig. 3 and Fig. 4).

MUC1/V has been found to be generated by a splicing
mechanism, using different splice donor and splice
acceptor sites, located upstream and downstream to the
tandem repeat array of MUC1 while also maintaining the
original reading frame, and therefore this protein retains
the cytoplasmic and transmembrane domains, as well as the
amino acids immediately N-terminal to the transmembrane
domain.

On the other hand, MUC1/W and MUC1/Z are generated by a
splicing mechanism in which the original reading frame is
not maintained and therefore the proteins do not include the
cytoplasmic and transmembrane domains (Figs. 1A and 1B,
Fig. 2, Fig. 3 and Fig. 4) and are therefore secreted from
the cell.
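
The distinction drawn above between splices that maintain the original reading frame and those that do not can be illustrated with a short sketch. The toy sequence and coordinates below are hypothetical and are not taken from the MUC1 gene; the sketch only shows that removing a segment whose length is a multiple of three leaves the downstream codons unchanged, whereas any other length shifts them.

```python
def splice_out(seq: str, start: int, end: int) -> str:
    """Return seq with the segment seq[start:end] removed (a toy 'splice')."""
    return seq[:start] + seq[end:]


def frame_preserved(start: int, end: int) -> bool:
    """A removal keeps the downstream frame only if its length is divisible by 3."""
    return (end - start) % 3 == 0


def codons(seq: str) -> list[str]:
    """Split seq into complete codons, reading from position 0."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


# Hypothetical toy coding sequence: ATG GCT AAA CCC GGT TAA
toy = "ATGGCTAAACCCGGTTAA"

in_frame = splice_out(toy, 6, 12)   # removes 6 nucleotides (AAA CCC)
shifted = splice_out(toy, 6, 11)    # removes 5 nucleotides
```

Here `codons(in_frame)` still ends with the original `GGT`, `TAA` codons, while `codons(shifted)` reads a different set of downstream codons, which is the behaviour the text attributes to the splices that do not maintain the reading frame.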

Further extensive research, testing and analysis
indicate that MUC1/X, MUC1/Y, MUC1/V and their /alt
configurations serve as receptor proteins in breast cancer
cells, while MUC1/W and MUC1/Z and their /alt configurations
function as ligands for said receptors.

In contrast to the new MUC1/X, MUC1/X/alt, MUC1/Y,
MUC1/Y/alt, MUC1/V, and MUC1/V/alt proteins that are
continuous from their N-terminal extracellular domains
through to their C-terminal cytoplasmic domains (Fig. 3,
Fig. 12 and Fig. 13), the tandem repeat array-containing
MUC1 protein is proteolytically cleaved in its extracellular
domain [Ligtenberg, et al., "Cell Associated Episialin Is a
Complex Containing Two Proteins Derived From a Common
Precursor," J. Biol. Chem., Vol. 267, pp. 6171-6177 (1992)].
Integrity of the MUC1 extracellular domain as in the MUC1/X,
MUC1/X/alt, MUC1/Y, MUC1/Y/alt, MUC1/V and MUC1/V/alt
proteins is likely to be essential for ligand binding.

Furthermore, the MUC1 amino acid sequence reveals
striking similarities to sequences in the extracellular
domain of cytokine receptors that are known to participate
in ligand binding. Significantly, this homology maps in
close proximity to the region where proteolytic cleavage
occurs in the tandem repeat array-containing MUC1 protein,
suggesting that integrity of this site in the MUC1/X,
MUC1/X/alt, MUC1/Y, MUC1/Y/alt, MUC1/V and MUC1/V/alt
proteins is of prime importance for both ligand binding and
signal transmission. This demonstrates that the MUC1/X,
MUC1/X/alt, MUC1/Y, MUC1/Y/alt, MUC1/V and MUC1/V/alt
proteins are cytokine-like receptor molecules.

Furthermore, experiments carried out with the MUC1
proteins described previously in the literature, which
are characterized by the presence of the tandem repeat
array, showed that these proteins do not transform cells
into cancerous cells. Specifically, when expression
vectors containing cDNA coding for the tandem repeat array
MUC1 protein were transfected into eucaryotic cells, the
said transfectants did not become tumorigenic. In
contradistinction thereto, transfection of expression
vectors containing cDNA coding for the MUC1/Y protein of the
present invention into cells caused the said cells to
become tumorigenic, as described hereinbelow.

As is known, the biological effects of many factors
controlling cell proliferation, differentiation and
metabolism are mediated by membrane-located proteins
(receptors) that participate in signal transduction
processes. Invariably, growth factor binding to specific
cell surface receptors initiates a signalling cascade that
is transduced in many cases via phosphorylation of tyrosine
residues within the receptor protein [M.J. Pazin and L.T.
Williams, "Triggering Signalling Cascades by Receptor
Tyrosine Kinases," TIBS, Vol. 17, pp. 374-378 (1992)].
Assembly of receptor signalling complexes formed between the
receptor protein and SRC homology 2 (SH2) domain-containing
proteins that interact with phosphorylated tyrosine residues
present in the receptor cytoplasmic domain mediates the
signal transduction process. This triggering ultimately
results in the activation of specific gene expression
involving transcription of both immediate and delayed
response genes.

A number of cell surface receptor proteins are likely
involved in both the origin and progression of human breast
cancer; a prime example is the neu (erbB-2) membrane-located
receptor molecule [D.J. Slamon, et al., "Studies on
the HER-2/neu Protooncogene in Human Breast and Ovarian
Cancer," Science, Vol. 244, pp. 707-712 (1989)]. It is
unfortunate to note, however, that only
exceptionally few genes that code for signal transducing
molecules in general, and membrane-located receptor proteins
in particular, have to date been implicated in the
development of human breast cancer.

Thus, as stated above, there have now been identified
and characterized novel protein products of the MUC1 gene,
designated herein as MUC1/X, MUC1/Y and MUC1/V, that reside
in the cell membrane and function as receptor proteins, and
are highly expressed in human breast cancer tissue. There
have also now been identified and characterized novel
protein products of the MUC1 gene, designated herein as
MUC1/W and MUC1/Z, the latter of which has been found to
function as a ligand, and the former of which is believed to
have a similar function, based on its structure.

These proteins and the /alt configurations thereof, as 
well as functional derivatives thereof, form the basis of 
the present invention. 

Thus, the present invention further provides a
pharmaceutical composition comprising as an active
ingredient therein a biochemically purified MUC1 protein
selected from the group consisting of MUC1/X, MUC1/X/alt,
MUC1/Y, MUC1/Y/alt, MUC1/V, MUC1/V/alt, MUC1/W, MUC1/W/alt,
MUC1/Z, MUC1/Z/alt and functional derivatives thereof,
devoid of a tandem repeat array.

More specifically, the present invention provides,
inter alia, a pharmaceutical composition for the treatment
of human breast cancer, comprising as an active ingredient
therein a biochemically pure MUC1 protein selected from the
group consisting of MUC1/X, MUC1/X/alt, MUC1/Y, MUC1/Y/alt,
MUC1/V, MUC1/V/alt, MUC1/W, MUC1/W/alt, MUC1/Z, MUC1/Z/alt
and functional derivatives thereof, in soluble form and in
combination with a pharmaceutically acceptable carrier.

The invention also provides a conjugated toxin for the
treatment of human breast cancer, comprising a MUC1 protein
selected from the group consisting of MUC1/W, MUC1/W/alt,
MUC1/Z, MUC1/Z/alt, and functional derivatives thereof,
attached to a cytotoxic agent.

In another aspect of the present invention, there is
provided a diagnostic agent for the detection of human
breast cancer cells, comprising a detectably labelled MUC1
protein selected from the group consisting of MUC1/W,
MUC1/W/alt, MUC1/Z, MUC1/Z/alt, and functional derivatives
thereof.

The invention also provides a diagnostic agent for
identification of sites in the body to which breast cancer
cells have spread, comprising a detectably labelled MUC1
protein selected from the group consisting of MUC1/W,
MUC1/W/alt, MUC1/Z, MUC1/Z/alt, and functional derivatives
thereof.

As will be realized from the above, the invention also
includes a method for the treatment of human breast cancer,
comprising administering to an individual having human
breast cancer cells an amount of soluble MUC1/X, MUC1/X/alt,
MUC1/Y, MUC1/Y/alt, MUC1/V, or MUC1/V/alt receptors,
sufficient to inhibit the binding of MUC1 ligands to said
cells.

In yet another aspect of the present invention, there
is provided a method for the treatment of human breast
cancer, comprising administering to an individual having
human breast cancer cells an amount of a ligand-toxin
conjugate comprising a ligand selected from MUC1/W,
MUC1/W/alt, MUC1/Z or MUC1/Z/alt, fused to a cytotoxic
toxin.

The MUC1/Z and MUC1/W proteins may be used:

a) for breast cancer diagnosis and prognosis, both in
vivo and in vitro;

b) for imaging cancer tissue; and 

c) for therapy of breast cancer patients. 

Breast Cancer Diagnosis and Prognosis

As the MUC1/W and MUC1/Z proteins are synthesized by
breast cancer tissue and are secreted from the cell, their
serum levels can serve as markers for the disease. Assays
employing antibodies directed against the MUC1/W and MUC1/Z
proteins are used to analyze the serum levels of these
proteins. This provides a means for diagnosing individuals
with early breast cancer, and/or for monitoring the
progression of breast cancer in patients who already have
been diagnosed.

In general, ELISAs are the preferred immunoassays 
employed to assess the amount of the new proteins described 
and claimed herein present in a specimen. ELISA assays are 
well-known to those skilled in the art. Both polyclonal and 
monoclonal antibodies can be used in the assays. Where 
appropriate, other immunoassays, such as radioimmunoassays 
(RIA) can be used, as known to those skilled in the art. 
Available immunoassays are extensively described in the
patent and scientific literature. See, for example, U.S.
Patents 3,791,932; 3,839,153; 3,850,752; 3,850,578;
3,853,987; 3,867,517; 3,879,262; 3,901,654; 3,935,074;
3,984,533; 3,996,345; 4,034,074 and 4,098,876, as well as
Sambrook, et al., Molecular Cloning: A Laboratory Manual,
Cold Spring Harbor Laboratory, New York, U.S.A. (1989), and
E. Harlow and D. Lane, Antibodies: A Laboratory Manual,
Cold Spring Harbor Laboratory, New York, U.S.A. (1988).

Imaging of Breast Cancer Tissue 

The identification of sites in the body to which breast
cancer cells have spread is of prime importance for the
successful eradication of the disease. The MUC1/Z ligand
specifically homes in on breast cancer cells expressing
the target MUC1/X, MUC1/Y and MUC1/V receptor molecules,
providing the means for efficiently localizing cancerous
tissue. Imaging is performed by tagging the MUC1/Z ligand
with, for example, radioactivity, injecting the labelled
MUC1/Z protein into the patient, and monitoring its
localization within the body.

Therapy of Breast Cancer Patients with Ligand

1. Ligand as a Drug-Delivery System

Using the MUC1/Z ligand as a drug delivery system,
ligand-toxin conjugates are prepared, such as MUC1/Z fused
to a cytotoxic toxin.

The toxin thus specifically homes in on the target
breast cancer cell, which is then killed. Alternatively,
the ligand is labelled with cytotoxic levels of
radioactivity. The target breast cancer cells are then
directly eradicated by the radioactively-labelled ligand.

2. Blockade of MUC1/X, MUC1/Y and MUC1/V Receptors without
Receptor Activation

By using defined regions of the ligand that only bind 
to the receptor, yet do not activate it, it is possible to 
effectively "swamp" the receptors present on the breast 
cancer cell with non-activating ligand. Receptor occupancy 
with non-activating ligands (antagonistic ligands) will 
preclude the binding of activating ligands, thereby limiting 
the growth of the breast cancer cell. 
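
The effect of "swamping" the receptors can be illustrated with a standard equilibrium model of two ligands competing for a single binding site. This model is an assumption introduced here for illustration; the specification gives no binding constants, and all concentrations and dissociation constants in the sketch are hypothetical and expressed in arbitrary but consistent units.

```python
def activating_ligand_occupancy(
    activating_conc: float,
    activating_kd: float,
    antagonist_conc: float = 0.0,
    antagonist_kd: float = 1.0,
) -> float:
    """Fraction of receptors bound by the activating ligand at equilibrium.

    Assumes simple one-site competitive binding: both ligands compete for
    the same site, and each is described by a dissociation constant (Kd).
    """
    if activating_kd <= 0 or antagonist_kd <= 0:
        raise ValueError("dissociation constants must be positive")
    a = activating_conc / activating_kd
    b = antagonist_conc / antagonist_kd
    return a / (1.0 + a + b)


# Hypothetical values: activating ligand present at its Kd.
without_antagonist = activating_ligand_occupancy(1.0, 1.0)
with_antagonist = activating_ligand_occupancy(1.0, 1.0, antagonist_conc=98.0)
```

Under these assumed values, half of the receptors are occupied by the activating ligand when no antagonist is present, and the fraction falls to 1% when the non-activating ligand is present at 98 times its own dissociation constant.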

The specification and claims provide guidance for the
use of the invention in humans. The Investigator's Handbook
provided by the Cancer Therapy Evaluation Program, Division
of Cancer Treatment, National Cancer Institute, U.S.A.,
indicates that the starting dose for Phase I trials is based
on animal data such as rodent equivalent LD10. Further, the
manual (page 22) indicates that animal studies carried out
prior to Phase I trials provide the investigator with a
prediction of the likely effects. [See also J.S. Driscoll,
"The Preclinical New Drug Research Program of the National
Cancer Institute," Cancer Treatment Reports, Vol. 68,
pp. 63-76 (1984).] Therefore, the data accumulated in a
mouse model is not only acceptable in determining human
doses and protocols, but is considered highly predictive.

The new MUC1 proteins of the present invention, i.e.,
the proteins selected from the group of proteins consisting
of MUC1/X, MUC1/X/alt, MUC1/Y, MUC1/Y/alt, MUC1/V,
MUC1/V/alt, MUC1/W, MUC1/W/alt, MUC1/Z and MUC1/Z/alt, as
well as their functional derivatives as defined herein, are
prepared by recombinant DNA technology and polypeptide
synthesis.

Thus, the new MUC1 proteins of the present invention
are prepared by culturing a host cell transformed with an
expression vector comprising DNA encoding an amino acid
sequence of the new MUC1 proteins in a nutrient medium, and
recovering the new MUC1 proteins from the cultured broth.

Particulars of the above-mentioned process are 
explained in detail below. 

The host cell may include a microorganism [bacteria
(e.g., Escherichia coli, Bacillus subtilis, etc.); yeast
(e.g., Saccharomyces cerevisiae, etc.)], cultured human or
animal cells (e.g., CHO cell, L929 cell, etc.), cultured
plant cells, and cultured insect cells. Preferred examples
of the microorganism include bacteria, especially a strain
belonging to the genus Escherichia (e.g., E. coli HB-101,
ATCC 33694; E. coli HB-101-16, FERM BP-1872; E. coli 294,
ATCC 31446; E. coli χ1776, ATCC 31537, etc.); yeast, animal
cell lines (e.g., mouse L929 cell, Chinese hamster ovary
(CHO) cell, etc.), and the like.

When the bacterium, especially E. coli, is used as a
host cell, the expression vector usually comprises at least
a promoter-operator region, initiation codon, DNA encoding
the amino acid sequence of the new MUC1 proteins,
termination codon, terminator region, and replicatable unit.
When yeast or an animal cell is used as host cell, the
expression vector is preferably composed of at least
promoter, initiation codon, DNA encoding the amino acid
sequence of the signal peptide and the new MUC1 proteins,
and termination codon, and it is possible that enhancer
sequences, 5'- and 3'-noncoding region of the native MUC1
proteins, splicing junctions, polyadenylation site and
replicatable unit are also inserted into the expression
vector.

The promoter-operator region comprises a promoter, an
operator and a Shine-Dalgarno (SD) sequence (e.g., AAGG,
etc.). Examples of the promoter-operator region include
conventionally employed promoter-operator regions (e.g.,
lactose-operon, PL-promoter, trp-promoter, etc.). The
promoter for the expression of the new MUC1 protein in
mammalian cells may include HTLV-promoter, SV40 early- or
late-promoter, LTR-promoter, mouse metallothionein I (MMT)-
promoter and vaccinia-promoter.

Preferred initiation codon includes methionine codon
(ATG).

The DNA encoding signal peptide includes the DNA
encoding signal peptide of the new MUC1 proteins.

The DNA encoding the amino acid sequence of the signal
peptide or the new MUC1 proteins is prepared in a
conventional manner, such as a partial or whole DNA
synthesis using DNA synthesizer and/or treatment of the
complete DNA sequence coding for native or mutant MUC1
proteins inserted in a suitable vector obtainable from a
transformant or genome in a conventional manner (e.g.,
digestion with restriction enzyme, dephosphorylation with
bacterial alkaline phosphatase, ligation using T4 DNA
ligase).

The termination codon(s) include conventionally 
employed termination codon (e.g., TAG, TGA, etc.). 

The terminator region contains natural or synthetic 
terminator (e.g., synthetic fd phage terminator, etc.). 

The replicatable unit is a DNA sequence capable of
replicating the whole DNA sequence belonging thereto in the
host cells and includes natural plasmid, artificially
modified plasmid (e.g., DNA fragment prepared from natural
plasmid) and synthetic plasmid, and preferred examples of
the plasmid include plasmid pBR 322 or artificially modified
plasmid thereof (DNA fragment obtained from a suitable
restriction enzyme treatment of pBR 322) for E. coli;
plasmid pRSVneo ATCC 37198, plasmid pSV2dhfr ATCC 37145,
plasmid pdBPV-MMTneo ATCC 37224, plasmid pSV2neo ATCC 37149
for mammalian cell.

The enhancer sequence includes the enhancer sequence 
(72 bp) of SV40. 

The polyadenylation site includes the polyadenylation 
site of SV40. 

The splicing junction includes the splicing junction of
SV40.

The promoter-operator region, initiation codon, DNA
encoding the amino acid sequence of the new MUC1 proteins,
termination codon(s) and terminator region are consecutively
and circularly linked together with an adequate replicatable
unit (plasmid) if desired, using adequate DNA fragment(s)
(e.g., linker, other restriction site, etc.) in a
conventional manner (e.g., digestion with restriction
enzyme, phosphorylation using T4 polynucleotide kinase,
ligation using T4 DNA ligase) to give an expression vector.
When mammalian cell line is used as a host cell, it is
possible that enhancer sequence, promoter, 5'-noncoding
region of the cDNA of the native MUC1 proteins, initiation
codon, DNA encoding amino acid sequences of the signal
peptide and the new MUC1 proteins, termination codon(s),
3'-noncoding region, splicing junctions and polyadenylation
site are consecutively and circularly linked together with an
adequate replicatable unit in the above manner.
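
The order in which the elements are linked, as described above, may be summarized as a simple data structure. The following is only an illustrative sketch: the element labels are paraphrased from the text, and the names `E_COLI_VECTOR`, `MAMMALIAN_VECTOR` and `missing_elements` are hypothetical and do not correspond to any existing tool.

```python
# Element order for a bacterial (E. coli) host, as listed in the text.
E_COLI_VECTOR = [
    "promoter-operator region",
    "initiation codon",
    "DNA encoding the new MUC1 protein",
    "termination codon(s)",
    "terminator region",
    "replicatable unit",
]

# Element order for a mammalian cell line host, as listed in the text.
MAMMALIAN_VECTOR = [
    "enhancer sequence",
    "promoter",
    "5'-noncoding region",
    "initiation codon",
    "DNA encoding the signal peptide and the new MUC1 protein",
    "termination codon(s)",
    "3'-noncoding region",
    "splicing junctions",
    "polyadenylation site",
    "replicatable unit",
]


def missing_elements(design: list[str], required: list[str]) -> list[str]:
    """Return the required elements absent from a design, in required order."""
    present = set(design)
    return [element for element in required if element not in present]
```

For instance, `missing_elements(E_COLI_VECTOR[:-1], E_COLI_VECTOR)` reports that the replicatable unit has been left out of the design.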

The expression vector is inserted into a host cell by 
methods known per se. The insertion is carried out in a 
conventional manner (e.g., transformation including 
transfection, microinjection, etc.) to give a transformant 
including transfectant. 

For the production of the new MUC1 proteins in the
process of the present invention, thus obtained transformant
comprising the expression vector is cultured in a nutrient
medium.

The nutrient medium contains carbon source(s) (e.g.,
glucose, glycerine, mannitol, fructose, lactose, etc.) and
inorganic or organic nitrogen source(s) (e.g., ammonium
sulfate, ammonium chloride, hydrolysate of casein, yeast
extract, polypeptone, bactotrypton, beef extracts, etc.). If
desired, other nutritious sources [e.g., inorganic salts
(e.g., sodium or potassium biphosphate, dipotassium hydrogen
phosphate, magnesium chloride, magnesium sulfate, calcium
chloride), vitamins (e.g., vitamin B1), antibiotics (e.g.,
ampicillin), etc.] are added to the medium. For the culture
of mammalian cell, Dulbecco's Modified Eagle's Minimum
Essential Medium (DMEM) supplemented with fetal calf serum
and an antibiotic is often used.

The culture of transformant is generally carried out
at pH 5.5-8.5 (preferably pH 7-7.5) and 18-40°C (preferably
25-38°C) for 5-50 hours.
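
The culture ranges stated above can be captured as a small configuration check. This is merely a sketch under the assumption that the general and preferred ranges are inclusive at both ends; the names `Range`, `GENERAL`, `PREFERRED` and `classify` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Range:
    low: float
    high: float

    def contains(self, value: float) -> bool:
        # Assumption: both end points are treated as inclusive.
        return self.low <= value <= self.high


# Ranges taken from the text; no preferred duration is given there.
GENERAL = {
    "pH": Range(5.5, 8.5),
    "temperature_c": Range(18.0, 40.0),
    "hours": Range(5.0, 50.0),
}
PREFERRED = {
    "pH": Range(7.0, 7.5),
    "temperature_c": Range(25.0, 38.0),
}


def classify(ph: float, temperature_c: float, hours: float) -> str:
    """Classify culture conditions as 'preferred', 'general' or 'out of range'."""
    values = {"pH": ph, "temperature_c": temperature_c, "hours": hours}
    if not all(GENERAL[key].contains(value) for key, value in values.items()):
        return "out of range"
    if all(PREFERRED[key].contains(values[key]) for key in PREFERRED):
        return "preferred"
    return "general"
```

With these ranges, a culture at pH 7.2 and 37°C for 24 hours is classified as preferred, whereas pH 6.0 at 30°C for the same duration falls within the general range only.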

When a bacterium such as E. coli is used as a host
cell, the new MUCl proteins thus produced generally exist in
the cells of the cultured transformant. The cells are
collected by filtration or centrifugation, and the cell wall
and/or cell membrane thereof are destroyed in a conventional
manner (e.g., treatment with supersonic waves and/or
lysozyme, etc.) to give debris. From the debris, the new
MUCl proteins are purified and isolated in a conventional
manner, as generally employed for the purification and
isolation of natural or synthetic proteins [e.g.,
dissolution of protein with an appropriate solvent (e.g., 8M
aqueous urea, 6M aqueous guanidinium salts, etc.), dialysis,
gel filtration, column chromatography, high performance
liquid chromatography, etc.]. When a mammalian cell is used 
as a host cell, the produced new MUCl proteins generally 
exist in the culture solution. The culture filtrate 
(supernatant) is obtained by filtration or centrifugation of 
the cultured broth. From the culture filtrate, the new MUCl 
proteins are purified in a conventional manner.

As will be realized, having now identified the new MUCl
proteins of the present invention, purified antibodies, both
polyclonal and monoclonal, which specifically bind
respectively to each of said proteins can be readily
prepared by methods per se known in the art. Once said
antibodies are prepared, they can be conjugated to a
therapeutic drug or a detectable moiety and/or bound to a
solid support.

The preparation of said antibodies also enables the 
carrying-out of a bioassay for determining the amount of a 
MUCl protein selected from the group consisting of MUCl/X, 
MUCl/X/alt, MUCl/Y, MUCl/Y/alt, MUCl/V, MUCl/V/alt, MUCl/W, 
MUCl/W/alt, MUCl/Z and MUCl/Z/alt or a functional derivative 
thereof devoid of a tandem repeat array, comprising (a) 
contacting the biological sample with an antibody under 
conditions such that a specific complex of the antibody and 
said MUCl protein can be formed; and (b) determining the 
amount of the antibody/MUCl protein complex, the amount of 
the complex indicating the amount of said MUCl protein in 
the biological sample, and allows the method of detecting 
the presence of a cancer in a subject comprising determining 
the presence of a detectable amount of said MUCl protein in 
a biopsy from the subject, the presence of a detectable 
amount of said MUCl protein relative to the absence of MUCl 
protein in a normal control indicating the presence of a 
cancer, and the method of determining the prognosis of a 
subject having cancer, comprising determining the presence 
of a detectable amount of said MUCl protein in a biopsy from 
the subject, the presence of a detectable amount of MUCl 
protein relative to the absence of said MUCl protein in a 
normal control indicating a decreased chance of long-term 
survival.

While the invention will now be described in connection
with certain preferred embodiments in the following examples 
and with reference to the following illustrative figures so 
that aspects thereof may be more fully understood and 
appreciated, it is not intended to limit the invention to 
these particular embodiments. On the contrary, it is
intended to cover all alternatives, modifications and
equivalents as may be included within the scope of the 
invention as defined by the appended claims. Thus, the 
following examples which include preferred embodiments of 
the novel proteins, the functional derivatives thereof, the 
combination thereof with cytotoxic agents and detectably 
labelled markers, as well as the preparation of DNA 
constructs, vectors, and transfected hosts encoding and 
incorporating the same, and the various uses thereof, will 
serve to illustrate the practice of this invention, it being 
understood that the particulars shown are by way of example 
and for purposes of illustrative discussion of preferred 
embodiments of the present invention only and are presented 
in the cause of providing what is believed to be the most 
useful and readily understood description of formulation
procedures as well as of the principles and conceptual 
aspects of the invention. 

BRIEF DESCRIPTION OF THE DRAWINGS
In the drawings:

Fig. 1A is a scheme of alternative splice events (W, X, Y
and Z) that delete the MUCl tandem repeat array and
flanking sequences;

Fig. 1B is a scheme of alternative splice events (W, X, Y
and Z) and nucleotide sequence of the regions 5'
flanking the AG consensus splice acceptor site;

Fig. 11 depicts the binding of tyrosine phosphorylated MUCl
cytoplasmic domain to SH2 domains;

Fig. 12 is a scheme depicting the repeat array containing
MUCl protein (upper drawing) and the novel MUCl/Y
protein (lower drawing);

Fig. 13 is a scheme depicting the location of tyrosine and
cysteine residues in the MUCl proteins; and

Fig. 14 is a comparison scheme of MUCl sequences and
sequences known to interact with SH2 domains.



DETAILED DESCRIPTION OF THE INVENTION 
With regard to the attached drawings, the following is 
a more detailed description thereof, so that the same can be 
more readily understood: 

Fig. 1A: Scheme of alternative splice events (W, X, Y
and Z) that delete the MUCl tandem repeat array and flanking
sequences. The MUCl genomic sequence is indicated by the
continuous line. The various splice events (W, X, Y and Z)
that delete the tandem repeat array are indicated. The
dinucleotides at the splice donor and splice acceptor sites
are indicated by GT and AG, respectively. The X and Y
splices retain the same reading frame (RF) as the MUCl
protein, whereas W and Z change the reading frame. The
signal peptide and the transmembrane domains are indicated
by SIG and TM, respectively.
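
By way of illustration only, the distinction drawn above between splices that retain the reading frame and splices that change it comes down to simple arithmetic: removing a number of coding nucleotides divisible by three leaves the downstream codons in frame, whereas any other length shifts them. The following Python sketch shows that check. The function name and the deletion lengths are hypothetical and chosen only for illustration; they are not taken from the MUCl sequence or from the figures.

```python
def retains_reading_frame(deleted_nt: int) -> bool:
    """Return True if deleting `deleted_nt` coding nucleotides keeps the frame.

    A deletion keeps the downstream codons in frame only when its length
    is a multiple of three (one codon = three nucleotides).
    """
    if deleted_nt < 0:
        raise ValueError("deleted length must be non-negative")
    return deleted_nt % 3 == 0


# Hypothetical deletion lengths, for illustration only (not measured values).
examples = {"in-frame deletion": 1200, "frame-shifting deletion": 1201}
for label, length in examples.items():
    print(label, length, retains_reading_frame(length))
```

Under this reading, the X and Y splices would correspond to the in-frame case and the W and Z splices to the frame-shifting case.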

Fig. 1B: Scheme of alternative splice events (W, X, Y
and Z) and 5' sequences flanking the splice acceptor site.
The pyrimidine-rich sequences 5' flanking the W, X, Y and Z
splice acceptor sites are shown. Other symbols are as in
Fig. 1A.

Fig. 2: Alternative MUCl N-terminal signal peptide
sequences. The amino terminal (N-terminal) amino acid
sequence is presented using the one letter code. The lower
sequence represents the N-terminal sequence that includes an
extra 9 amino acids (boxed sequence) that is generated by an
alternative splice event. Numbers appearing above the amino
acid sequence represent the probability (calculated
according to the Von Heijne signal peptide cleavage rules;
arbitrary units are used) of signal peptide cleavage
occurring at that site. The upward-facing arrow represents
the most likely site of signal peptide cleavage.

Fig. 3: Scheme of the repeat array containing MUCl
protein (upper molecule) and the novel MUCl/Y, MUCl/X,
MUCl/W and MUCl/Z proteins. The novel MUCl/Y, MUCl/X,
MUCl/W and MUCl/Z proteins are generated by alternative
splicing events that delete the central tandem repeat array
(compare upper and lower molecules). All MUCl forms contain
a hydrophobic N-terminal signal sequence (slashed box at
left of figure) that is co-translationally cleaved (arrow at
left of figure). This is followed by the tandem repeat
array (upper molecule) that is illustrated by the block of
closely-spaced vertical lines. The highly hydrophobic 28
amino acid stretch constituting the transmembrane domain
(TM) is shown at the C-terminal end of both MUCl proteins,
followed by the cytoplasmic domain (CYT). The region
comprising the proteolytic cleavage site [Ligtenberg, et
al., J. Biol. Chem., ibid. (1992)] of the repeat array
containing MUCl protein (upper molecule) is indicated by the
two vertical dotted lines just N-terminal to the
transmembrane domain. Potential N-linked glycosylation sites
are shown with an asterisk (*). The W and Z splice events
alter the reading frame of the MUCl protein downstream of
their respective splice acceptor sites, and the MUCl/W and
MUCl/Z proteins therefore contain downstream amino acid
sequences that differ from the MUCl/Y and MUCl/X proteins.

Fig. 4: Scheme of the repeat array containing MUCl/alt
protein that has the variant signal peptide at its
N-terminal and the novel MUCl/Y/alt, MUCl/X/alt, MUCl/W/alt
and MUCl/Z/alt proteins generated by alternative splicing.
The altered N-terminal (see Fig. 2) resulting from the
altered signal peptide is illustrated immediately distal to
the slashed box at the N-terminus. All the resulting novel
MUCl/Y/alt, MUCl/X/alt, MUCl/W/alt and MUCl/Z/alt proteins
will accordingly have the variant N-terminus. Other symbols
are as in Fig. 3.

Fig. 5A: Amino acid sequence of the MUCl/X protein.
The amino acid sequence (one letter code) is shown beneath
the nucleotide sequence and begins with the initiating
methionine residue. Numbering of the amino acid residues is
shown to the right of the figure.

Fig. 5B: Amino acid sequence of the MUCl/X/alt
protein. The amino acid sequence (one letter code) is shown
beneath the nucleotide sequence and begins with the
initiating methionine residue. Numbering of the amino acid
residues is shown to the right of the figure.

Fig. 6A: Amino acid sequence of the MUCl/Y protein.
The amino acid sequence (one letter code) is shown beneath
the nucleotide sequence and begins with the initiating
methionine residue. Numbering of the amino acid residues is
shown to the right of the figure.

Fig. 2 shows amino terminal amino acid sequences of the MUCl
proteins, demonstrating the two variant MUCl signal
peptide forms and sites of signal peptide cleavage;

Fig. 3 is a scheme of the repeat array containing MUCl
protein (upper molecule) and the novel MUCl/W, MUCl/X,
MUCl/Y and MUCl/Z proteins generated by alternative
splicing;

Fig. 4 is a scheme of the repeat array containing MUCl/alt
protein that has the variant signal peptide at its
N-terminal and the novel MUCl/Y/alt, MUCl/X/alt,
MUCl/W/alt and MUCl/Z/alt proteins generated by
alternative splicing;

Fig. 5A shows the amino acid sequence of the MUCl/X
protein;

Fig. 5B shows the amino acid sequence of the MUCl/X/alt
protein;

Fig. 6A shows the amino acid sequence of the MUCl/Y protein;

Fig. 6B shows the amino acid sequence of the MUCl/Y/alt
protein;

Fig. 6C shows the amino acid sequence of the MUCl/V protein;

Fig. 6D shows the amino acid sequence of the MUCl/V/alt
protein;

Fig. 7A shows the amino acid sequence of the MUCl/W protein;

Fig. 7B shows the amino acid sequence of the MUCl/W/alt
protein;

Fig. 8A shows the amino acid sequence of the MUCl/Z protein;

Fig. 8B shows the amino acid sequence of the MUCl/Z/alt
protein;

Fig. 9 illustrates the overexpression of the novel MUCl/X,
MUCl/Y, and MUCl/V proteins in human breast cancer
tissue and post-translational modification by
phosphorylation;

Fig. 10 illustrates phosphorylation on tyrosine residues of
the MUCl/Y protein;



Fig. 6B: Amino acid sequence of the MUCl/Y/alt
protein. The amino acid sequence (one letter code) is shown
beneath the nucleotide sequence and begins with the
initiating methionine residue. Numbering of the amino acid
residues is shown to the right of the figure.

Fig. 6C: Amino acid sequence of the MUCl/V protein.
The amino acid sequence (one letter code) is shown beneath
the nucleotide sequence and begins with the initiating
methionine residue. Numbering of the amino acid residues is
shown to the right of the figure.

Fig. 6D: Amino acid sequence of the MUCl/V/alt
protein. The amino acid sequence (one letter code) is shown
beneath the nucleotide sequence and begins with the
initiating methionine residue. Numbering of the amino acid
residues is shown to the right of the figure.

Fig. 7A: Amino acid sequence of the MUCl/W protein.
The amino acid sequence (one letter code) is shown beneath
the nucleotide sequence and begins with the initiating
methionine residue. Numbering of the amino acid residues is
shown to the right of the figure.

Fig. 7B: Amino acid sequence of the MUCl/W/alt
protein. The amino acid sequence (one letter code) is shown
beneath the nucleotide sequence and begins with the
initiating methionine residue. Numbering of the amino acid
residues is shown to the right of the figure.

Fig. 8A: Amino acid sequence of the MUCl/Z protein.
The amino acid sequence (one letter code) is shown beneath
the nucleotide sequence and begins with the initiating
methionine residue. Numbering of the amino acid residues is
shown to the right of the figure.

Fig. 8B: Amino acid sequence of the MUCl/Z/alt
protein. The amino acid sequence (one letter code) is shown
beneath the nucleotide sequence and begins with the
initiating methionine residue. Numbering of the amino acid
residues is shown to the right of the figure.

Fig. 9: Overexpression of the novel MUCl/X, MUCl/Y and
MUCl/V proteins in human breast cancer tissue and
post-translational modification by phosphorylation.

(A) Cell lysates prepared from breast cancer cells
(lane 2), primary human breast cancer tissues from 3
different patients (lanes 1, 4 and 5) and the adjacent
normal breast tissues (lanes 3 and 6) were analyzed by
SDS-polyacrylamide gel electrophoresis (SDS-PAGE),
transferred to nitrocellulose and immunoblotted with a
rabbit polyclonal antibody directed against the MUCl
cytoplasmic domain. The regions of specific
immunoreactivity are indicated by the 3 open arrows to the
left of the figure.

(B) The novel MUCl/Y protein may be
post-translationally modified by phosphorylation. Radioactive
inorganic phosphate (³²P) was added to stable
Ras-transformed 3T3 cell transfectants expressing the MUCl/Y
protein and, following a 5-hour incubation, the cells were
lysed. Cell lysates subjected to immunoprecipitation with
either pre-immune serum or with immune serum generated
against the 62 C-terminal amino acids of the MUCl
cytoplasmic domain (lanes 1 and 2, respectively) were
analyzed by SDS-PAGE, followed by autoradiography. The
phosphorylated MUCl/Y protein is clearly visible in lane 2
(arrow to the right of the figure). Molecular size
standards are indicated at left of figures in kilodaltons.

Fig. 10: Phosphorylation on tyrosine residues of the
MUCl/Y protein. The immunoprecipitated phosphorylated MUCl
proteins [from lane 2 in Fig. 9(B)] were isolated from
SDS-acrylamide (10%) gel and hydrolyzed in 6N HCl at 110°C for
1 hour. Labelled phosphoamino acids (with added unlabelled
internal phosphoamino acid markers) were analyzed by
thin-layer high voltage electrophoresis, followed by
Phosphoimager analysis. The positions of migration of
phosphoserine, phosphothreonine and phosphotyrosine are
indicated by PS, PT and PY respectively, and inorganic
phosphate is shown by Pi.

Fig. 11: Binding of tyrosine phosphorylated MUCl
cytoplasmic domain to SH2 domains.

(A) The complete 72 amino acid sequence of the human
MUCl cytoplasmic domain is shown, using the one letter amino
acid code. Indicated below this are changes in the mouse
MUCl homologue. The 7 tyrosine residues in the cytoplasmic
domain are highlighted with an asterisk, and likely sites of
interaction between phosphotyrosine-containing peptide
sequences (boxed regions within the cytoplasmic domain amino
acid sequence) and SH2 domain containing proteins (boxed at
the bottom of the figure) are shown. The
cysteine-containing sequence is circled at the N-terminal of the
cytoplasmic domain.

(B) Recombinant MUCl cytoplasmic domain was
synthesized as a fusion protein with N-terminal DHFR protein
(from Halobacterium) using the pET system. The gel purified
recombinant protein was in-vitro tyrosine phosphorylated by
incubation with gamma-³²P-ATP and highly purified EGF
receptor (EGF-R) protein isolated from A431 cells. The
radioactively-labelled MUCl cytoplasmic domain was
repurified from an SDS-acrylamide (10%) gel and incubated
overnight at 4°C with either GST (glutathione transferase)
beads alone (lane 1), or with GST/GRB-2 fusion protein beads
(GRB-2, lane 2). The beads were then extensively washed and
labelled bound proteins analyzed by SDS-PAGE. Specific
GRB-2 binding of labelled MUCl cytoplasmic domain is
indicated by the arrow to the right of the figure.

(C) Labelled tyrosine phosphorylated MUCl cytoplasmic
domain, purified by SDS acrylamide (10%) gel, was incubated
with agarose beads bound to src SH2 domain (src, lane 1),
the C-terminal p85 phosphatidyl inositol (PI) 3' kinase SH2
domain (PI, lane 2), and the N-terminal phospholipase C
gamma 1 SH2 domain (lip. C, lane 3) and analyzed as
described above. Specific binding to the src and
phospholipase C SH2 domains (lanes 1 and 3, respectively) is
indicated by the arrow to the right of the figure. No
binding was observed to the C-terminal p85 (PI) 3' kinase SH2
domain (lane 2).

Fig. 12: Scheme showing the repeat array containing
MUCl protein (upper drawing) and the novel MUCl/Y protein
(lower drawing). The novel MUCl/Y form is generated by an
alternative splicing event that deletes the central tandem
repeat array (compare upper and lower molecules). Both MUCl
forms contain a hydrophobic N-terminal signal sequence
(slashed box at left of figure) that is co-translationally
cleaved (arrow at left of figure). This is followed by the
tandem repeat array (upper molecule) that is illustrated by
the block of closely-spaced vertical lines. The highly
hydrophobic 28 amino acid stretch constituting the
transmembrane domain (TM) is shown at the C-terminal end of
both MUCl proteins, followed by the cytoplasmic domain
(CYT). The region comprising the proteolytic cleavage site
[Ligtenberg, et al., J. Biol. Chem., ibid. (1992)] of the
repeat array containing MUCl protein (upper molecule) is
indicated by the two vertical dotted lines just N-terminal
to the transmembrane domain. The regions recognized by the
anti-repeat and anti-cytoplasmic domain (anti-cyt)
antibodies are indicated and potential N-linked
glycosylation sites are shown with an asterisk (*).

Fig. 13: Scheme showing the location of tyrosine and
cysteine residues in the MUCl proteins. The locations of
tyrosine and cysteine residues are indicated above the
rectangles by vertical lines and asterisks, respectively.
Both MUCl forms contain a hydrophobic N-terminal signal
sequence (slashed box at left of figure) that is
co-translationally cleaved (arrow at left of figure). This is
followed by the tandem repeat array (upper molecule) that is
illustrated by the block of closely-spaced vertical lines.
The highly hydrophobic 28 amino acid stretch constituting
the transmembrane domain (TM) is shown at the C-terminal end
of both MUCl proteins, followed by the cytoplasmic domain
(CYT). The region comprising the proteolytic cleavage site
[Ligtenberg, et al., J. Biol. Chem., ibid. (1992)] of the
repeat array containing MUCl protein (upper molecule) is
indicated by the two vertical dotted arrows just N-terminal
to the transmembrane domain. The regions recognized by the
anti-cytoplasmic domain (anti-cyt) antibodies are
indicated.

Fig. 14: Phosphotyrosine-Containing Peptide Sequences
Recognized by SH2 Domains and Their Comparison with MUCl
Cytoplasmic Domain Sequences. The sequence specificity of
the peptide-binding sites of SH2 domains has been previously
determined using a phosphopeptide library [Songyang, et al.,
Cell, Vol. 72, pp. 757-778 (1993)] and the data presented in
this Figure are in part from Table 3 of that reference. The
preferred amino acids 1, 2 and 3 residues C-terminal to
phosphotyrosine are indicated in the columns labelled
pY + 1, pY + 2 and pY + 3. The top line in each group
relates to the most preferred sequence, with lowered
preferences in the second and third lines. The boxed
sequences correlate best with MUCl cytoplasmic domain
sequences that are indicated in the right-hand column.
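
By way of illustration only, the comparison described for Fig. 14 rests on reading off the three residues immediately C-terminal to each tyrosine (positions pY + 1, pY + 2 and pY + 3). The following Python sketch shows one way that step might be carried out on a one-letter amino acid sequence. The function name and the short input sequence are hypothetical; the sequence is made up for the example and is not the MUCl cytoplasmic domain sequence.

```python
def residues_after_tyrosines(sequence: str) -> list[tuple[int, str]]:
    """List (1-based tyrosine position, residues at pY+1..pY+3) for each Y.

    Tyrosines with fewer than three residues C-terminal to them are skipped,
    because a full pY+1..pY+3 window cannot be read for them.
    """
    seq = sequence.upper()
    motifs = []
    for i, aa in enumerate(seq):
        if aa == "Y" and i + 3 < len(seq):
            motifs.append((i + 1, seq[i + 1:i + 4]))
    return motifs


# Hypothetical peptide, for illustration only.
print(residues_after_tyrosines("AAYVNIGGYE"))
```

Each three-residue window obtained in this way could then be compared by eye, or programmatically, with the preferred residues tabulated for a given SH2 domain.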

Experimental work detailed below has unequivocally
demonstrated that:

a) the MUCl/X, MUCl/Y and MUCl/V proteins are highly and
differentially expressed in breast cancer tissue as compared
to normal breast tissue [see Fig. 9];

b) the MUCl/X, MUCl/Y and MUCl/V proteins are extensively
phosphorylated [see Fig. 9];

c) phosphorylation occurs almost exclusively on tyrosine
residues [see Fig. 10];

d) the phosphorylated MUCl/X, MUCl/Y and MUCl/V proteins
interact specifically with the Src-homology (SH) domain SH2-
and SH3-containing proteins GRB-2, SRC and phospholipase C
gamma-1 [Fig. 11]; and

e) the MUCl/X, MUCl/Y and MUCl/V proteins potentiate the
transformed phenotype of cells and significantly enhance the
in-vivo tumorigenic potential of mammary epithelial cells.

These experimental data demonstrate that the MUCl/X,
MUCl/Y and MUCl/V proteins function as cell surface receptor
molecules participating in signal transduction, and are
intimately related to the development of human breast
cancer.

To assess expression of the MUCl proteins in-vivo,
extracts of human tissue samples were run on SDS denaturing
gels, transferred and probed with polyclonal antibodies
directed against the MUCl cytoplasmic domain. Analyses were
performed on malignant breast tumor tissue samples [Fig. 9A,
lanes 4 and 5], together with extracts from breast tissue
adjacent to the biopsied tumor sample [Fig. 9A, lanes 3 and
6]. Little or no specific immunoreactivity was observed in
the non-malignant breast tissue samples [Fig. 9A, lanes 3
and 6].

In marked contrast thereto, proteins specifically 
reactive with the anticytoplasmic domain antibodies were 
highly expressed both in breast cancer cells grown in-vitro 
and in the primary breast cancer tissue samples [Fig. 9A, 
lanes 2, 4 and 5 respectively]. 

The immunoreactive proteins migrated to distinct
positions correlating to molecular masses of approximately
25-30, 35 [in the in-vitro grown breast cancer cells, lane
2], and 40-43 kDa. Some of these immunoreactive proteins
may be generated by proteolytic cleavages occurring on the
large polymorphic tandem repeat array containing MUCl
protein at positions N-terminal to the transmembrane domain
[Fig. 12, upper molecule, the two dotted arrows just
N-terminal to the transmembrane domain]. However, the
MUCl/X, MUCl/Y proteins [Fig. 12, lower molecule], and
MUCl/V proteins are also likely represented by one or more
of these immunoreactive proteins. In distinguishing between
these possibilities, we were considerably aided by the
identification of a third breast tumor tissue sample
[Fig. 9, lane 1] that expresses specific anti-cytoplasmic
domain immunoreactive proteins with molecular masses of
approximately 40-43 kDa and 35 kDa [compare Fig. 9, lanes 1
and 2]. Probing an identical immunoblot with monoclonal
antibodies that recognize an epitope contained within the
tandem repeat array showed high levels of expression of the
large polymorphic MUCl proteins in the breast cancer cell
samples correlating to lanes 2, 4 and 5; no immunoreactive
proteins corresponding to the large polymorphic MUCl
proteins were detected in the third breast tumor correlating
to lane 1 [data not shown]. These data suggest therefore
that this third breast tumor tissue solely expresses the
MUCl/X, MUCl/Y and MUCl/V protein forms and thereby indicate
that the 35 and 40-43 kDa immunoreactive proteins are in
fact the MUCl/X and MUCl/Y proteins.

Tyrosine Phosphorylation of the MUCl/X, MUCl/Y and
MUCl/V Proteins

The calculated molecular mass of the MUCl/Y protein, as
determined by its primary amino acid sequence, is 25,986
Daltons. An increase in the molecular mass of the MUCl/Y
protein [to 35 and 40-43 kDa proteins] may occur by
post-translational modifications such as glycosylation
and/or phosphorylation. To investigate whether the MUCl/Y
protein is phosphorylated, radioactively-labelled inorganic
phosphate was added to stable transfectants expressing the
MUCl/Y protein, and cell lysates were subjected to anti-MUCl
cytoplasmic domain immunoprecipitation.
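
By way of illustration only, a molecular mass calculated from a primary amino acid sequence is conventionally obtained by summing the average masses of the residues and adding one molecule of water for the free termini. The following Python sketch shows such a calculation. The function name is hypothetical, the residue masses are standard average values supplied here as an assumption rather than taken from the present specification, and the example peptide is made up; the sketch is not asserted to be the procedure by which the figure quoted above was obtained, and it takes no account of signal peptide cleavage or of post-translational modifications.

```python
# Average residue masses in daltons (free amino acid minus one water).
# Standard textbook values, assumed here for illustration.
AVERAGE_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167,
    "V": 99.1326, "T": 101.1051, "C": 103.1388, "L": 113.1594,
    "I": 113.1594, "N": 114.1038, "D": 115.0886, "Q": 128.1307,
    "K": 128.1741, "E": 129.1155, "M": 131.1926, "H": 137.1411,
    "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.01528  # one water is added for the free N- and C-termini


def calculated_mass(sequence: str) -> float:
    """Return the average molecular mass (Da) of an unmodified polypeptide."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    unknown = set(seq) - set(AVERAGE_RESIDUE_MASS)
    if unknown:
        raise ValueError(f"unsupported residue codes: {sorted(unknown)}")
    return sum(AVERAGE_RESIDUE_MASS[aa] for aa in seq) + WATER_MASS


# Hypothetical short peptide, for illustration only.
print(round(calculated_mass("MTPGY"), 2))
```

A mass observed on a gel that exceeds the value calculated in this way is consistent with modifications such as glycosylation and/or phosphorylation, as discussed above.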

Specifically immunoprecipitated MUCl/Y protein migrated 
with a molecular mass of 40-43 kDa, and demonstrated a 
prominent signal [Fig. 9B, lane 2], indicating that the 
40-43 kDa MUCl/Y proteins [Fig. 9] are phosphorylated
proteins. A phosphoamino acid analysis performed on the
isolated phosphorylated MUCl/Y protein shows that greater 
than 90% of the phosphorylation occurs on tyrosine residues, 
with much reduced levels of phosphoserine and almost 
undetectable levels of threonine phosphorylation [Fig. 10]. 

Considering that within the cell greater than 99% of 
total protein phosphorylation occurs solely on serine and 
threonine residues, the almost exclusive tyrosine 
phosphorylation of the MUCl/Y protein is especially 
striking. Phosphorylated tyrosine residues play a pivotal 
role in signal transduction pathways [M.J. Pazin and L.T. 
Williams, ibid. (1992)] as, for example, those initiated by
growth factor receptors such as epidermal growth factor 
receptor (EGF-R), platelet derived growth factor receptor 
(PDGF-R), colony stimulating factor-1 receptor (CSFl-R), 
etc. This suggests therefore, that the extensively tyrosine 
phosphorylated MUCl/Y protein may also be performing an 
important signal-transducing function.

MUCl/Y Protein Interaction With SH2 Domain Proteins 

Analysis of the MUCl proteins demonstrates the 
following features: 

1) biased localization of tyrosine residues in the
cytoplasmic domain and sequences N-terminal to it [Fig. 13];

2) all tyrosine residues within the polymorphic MUCl
proteins are retained in the MUCl/X, MUCl/Y and
MUCl/V proteins [Fig. 13];

3) extensive similarity between the human and mouse MUCl
proteins within the amino acid sequence of the MUCl
cytoplasmic domain [Fig. 11]; and

4) marked similarity between tyrosine-containing sequences 
located within the MUCl cytoplasmic domain and 
phosphotyrosine-containing peptide sequences that are 
recognized by SH2 domain-containing proteins [Fig. 11]. 

Bearing in mind that the MUCl/X, MUCl/Y and MUCl/V
proteins are extensively phosphorylated on tyrosine
residues, these remarkable features indicate that the
MUCl/X, MUCl/Y and MUCl/V proteins act as receptor-like
molecules that participate in signal transduction. Thus, it
is now believed that the cytoplasmic domain of the MUCl/X,
MUCl/Y and MUCl/V proteins acts as a "surrogate" kinase
insert, in a way similar to CD19 [D.A. Tuveson, et al.,
"CD19 of B Cells as a Surrogate Kinase Insert Region to Bind
Phosphatidylinositol 3-Kinase," Science, Vol. 260, pp.
986-988 (1993)], and undergoes transphosphorylation on
tyrosine residues by other activated tyrosine kinases with
which it may specifically interact. This then forms a
signalling complex composed of the phosphorylated MUCl/X,
MUCl/Y and MUCl/V proteins and SH2 domain-containing
proteins [C.A. Koch, et al., "SH2 and SH3 Domains: Elements
that Control Interactions of Cytoplasmic Signalling
Proteins," Science, Vol. 252, pp. 668-674 (1991)], thereby
initiating signal transduction.

To test whether the cytoplasmic domain of the MUCl/Y
protein has the potential to interact specifically with SH2
domain-containing proteins, recombinant MUCl cytoplasmic
domain was synthesized and radioactively phosphorylated on
its tyrosine residues with highly purified epidermal growth
factor receptor (EGF-R). Incubation of the phosphorylated
MUCl cytoplasmic domain with either glutathione transferase
(GST) alone, or with Growth Factor Receptor Binding
Protein 2 [E.J. Lowenstein, et al., "The SH2 and SH3
Domain-Containing Protein GRB2 Links Receptor Tyrosine
Kinases to Ras Signalling," Cell, Vol. 70, pp. 431-442
(1992)]/GST (GRB-2/GST) fusion protein bound to agarose
beads, demonstrated marked binding to the GRB-2 protein
[Fig. 11B]. Analysis of the MUCl cytoplasmic domain amino
acid sequence [Fig. 11A and Fig. 14] indicates that it may
also interact with additional SH2 domain-containing
proteins.

Further experimentation demonstrated that purified,
recombinant MUCl cytoplasmic domain protein that had been
phosphorylated on its tyrosine residues specifically bound
to the SRC SH2 domain and to the SH2 domain derived from the
N-terminal part of the phospholipase C gamma 1 protein
[Fig. 11C, lanes 1 and 3]. Under identical conditions, no
binding was observed to the C-terminal p85
phosphatidylinositol (PI) 3' kinase SH2 domain.

To validate in the in-vivo situation the findings that
demonstrate in-vitro interactions of the MUCl/Y protein with
multiple SH2 domain-containing proteins and, in particular,
with the GRB-2 protein, human breast cancer tissue cell
lysates were prepared and incubated with either GST
(glutathione transferase) beads alone, or with GST/GRB-2
fusion protein beads. Bound proteins were analyzed by SDS
gel electrophoresis, transferred and subjected to probing
with anti-MUCl cytoplasmic domain antibodies. The MUCl/Y
protein was detected only in the sample that had been
incubated with the GST/GRB-2 fusion protein beads,
indicating that in the in-vivo situation the MUCl/Y protein
potentially interacts with GRB-2 protein.

MUCl/X, MUCl/Y and MUCl/V Protein Expression Alters Cell
Morphology and Increases Tumorigenic Potential

As the GRB-2 protein plays a key role in connecting 
tyrosine kinase receptors with the ras signal transduction 
system [E.J. Lowenstein, et al., ibid. (1992)], and as shown 
above, the MUCl/Y proteins contact the GRB-2 protein, the 
effect of MUCl/Y protein expression on the morphology of ras 
transformed 3T3 fibroblasts was investigated. Transfectants 
were generated from ras transformed 3T3 fibroblasts with the 
neomycin resistance gene alone, and in combination with an 
expression vector harboring cDNA coding for either the
MUCl/Y proteins or the large tandem repeat array containing 
MUCl protein. The parental ras transformed 3T3 fibroblasts, 
and control cells transfected only with the neomycin 
resistance gene, grew mostly in foci and cell clusters. As 
previously reported, transfectants expressing the large 
tandem repeat array containing MUCl protein displayed 
decreased cellular aggregation and did not grow in foci; 
this is likely due to the known anti-adhesive properties of 
the tandem repeat array containing MUCl protein. The effect 
of MUCl/Y protein expression on cell morphology was, 
however, immediately apparent. These transfectants 
displayed a marked increase in the number of foci, an 
altered phenotype that was observed in all independent 
MUCl/Y protein-expressing transfectants analyzed. This is 
indicative of the fact that expression of the MUCl/Y protein 
is indeed potentiating the transforming potential of the 
cell.

Next, tests were conducted to determine whether MUCl/Y
protein expression alters the tumorigenic potential of
mammary epithelial cells. Transfectants were generated 
using the DA3 mouse mammary epithelial cell line, derived 
from a DMBA-induced mouse mammary carcinoma, and expression 
of the MUCl/Y protein in the transfectants was assessed by 
Western blotting. Positive MUCl/Y transfectants, as well as
tandem repeat array containing MUCl transfectants and
control neomycin transfectants, were injected 
intramuscularly into female Balb/c mice at three different 
cell concentrations (S-IO^*, 10* and 5.10*) and the mice were 
monitored for tumor development. 

Mice injected with transfectants expressing the tandem 
repeat array containing MUCl protein, or with the control 
neomycin transfectants, showed similar patterns of tumor 
development. In marked contrast, however, tumors developed
rapidly in the MUCl/Y transfectant group and preceded the
appearance of tumors in the other two groups by weeks to
months, at all cell concentrations tested. For example,
tumors developed in all mice (5 per group) injected with the
MUCl/Y transfectant (5·10* cells per mouse) only 7 days
following injection. Animals injected with the control
neomycin transfectants showed tumor development in three out
of five mice; these tumors were first observed 6 weeks
following injection. This pattern of increased tumorigenicity
of the MUCl/Y transfectants was consistently observed at all
other cell concentrations tested.

The experimental work described above demonstrates that 
the MUCl/Y proteins are highly expressed in human breast 
cancer tissue; are extensively phosphorylated on tyrosine 
residues; interact specifically with the SRC homology domain 
(SH2) containing proteins GRB-2, SRC and phospholipase C
gamma-1; and increase cellular tumorigenic potential.

As is seen from the structure of the MUCl/X molecule, 
it is highly similar to the MUCl/Y molecule, except for the
insertion of 18 amino acids between amino acid residue
numbers 53 and 54 in the MUCl/Y sequence. The MUCl/X
protein is therefore believed to function as a receptor 
molecule in a similar fashion to the MUCl/Y protein, 
although its affinity for ligand may differ. This is also 
true for the /alt configurations of MUCl/Y and MUCl/X. 

As is seen from the structure of the MUCl/V molecule, 
it is highly similar to the MUCl/Y molecule. The MUCl/V 
protein is therefore believed to function as a receptor 
molecule in a similar fashion to the MUCl/Y protein, 
although its affinity for ligand may differ. This is also 
true for the /alt configurations of MUCl/Y, MUCl/X and 
MUCl/V. 

Taken together, the above data indicate that the 
MUCl/X, MUCl/X/alt, MUCl/Y, MUCl/Y/alt, MUCl/V and 
MUCl/V/alt proteins act as signal-transducing receptor-like 
molecules that form a signalling complex which is intimately
related to the oncogenic process.

The MUCl/X, MUCl/Y and MUCl/V proteins are, however, 
different from classical receptor tyrosine kinases, in that 
they do not contain a catalytic tyrosine kinase domain.
One of the postulates of the present hypothesis is that the 
cytoplasmic domains of the MUCl/X, MUCl/Y and MUCl/V 
proteins undergo transphosphorylation in a manner similar to 
that recently described for the B cell CD19 molecule [D.A.
Tuveson, et al., ibid. (1993)] and for other cytokine
receptors.

Having identified the MUCl/X, MUCl/Y and MUCl/V 
receptors, it is now possible to prepare functional 
derivatives thereof, including purified receptors in soluble 
form. 

Thus, for example, by deleting sequences downstream from
glycine amino acid number 173 in the MUCl/X sequence
[Fig. 5A], glycine amino acid number 155 in the MUCl/Y
sequence [Fig. 6A], or glycine amino acid number 140 in the
MUCl/V sequence [Fig. 6C], one produces truncated forms of
the membrane receptors,
which lack transmembrane and intracytoplasmic domains, but
retain the ligand-binding extracellular portion. The
affinities of soluble receptors for their ligands are
comparable to those of the membrane receptors, and thus said
soluble receptors can compete with the membrane-bound
receptors and inhibit binding of ligands to the cell and the
resulting activation thereof.
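
The truncation described above amounts to keeping residues 1 through the named glycine and discarding everything downstream. A minimal Python sketch of that bookkeeping follows. The function name `truncate_after_residue` and the toy sequence are hypothetical and are not taken from Figs. 5A, 6A or 6C; residue numbers are assumed to be 1-based.

```python
def truncate_after_residue(sequence: str, residue_number: int, expected: str = "G") -> str:
    """Return residues 1..residue_number (1-based), dropping everything downstream.

    Raises ValueError if the residue at the cut position is not the expected
    amino acid, which guards against an off-by-one error in the numbering.
    """
    if not 1 <= residue_number <= len(sequence):
        raise ValueError("residue_number lies outside the sequence")
    found = sequence[residue_number - 1]
    if found != expected:
        raise ValueError(f"expected {expected} at position {residue_number}, found {found}")
    return sequence[:residue_number]


# Toy sequence for illustration only (not a real MUCl isoform).
toy_receptor = "ACDEFGHIKLMNPQRSTVWY"
soluble_form = truncate_after_residue(toy_receptor, 6)
print(soluble_form)  # ACDEFG
```

Checking the identity of the residue at the cut site is a design choice of this sketch; it makes a mismatch between the figure numbering and the sequence in hand fail loudly rather than silently produce the wrong fragment.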

Furthermore, with the molecular characterization of the 
MUCl/X, MUCl/Y and MUCl/V receptor molecules described 
herein, one can design drugs that will specifically interact 
with these receptors. These drugs may then be used to 
target breast cancer cells, either for imaging or 
therapeutic purposes. 

Additionally, as receptor molecules are known to be 
shed off from cells into the peripheral circulation, assays 
employing antibodies directed against the MUCl/X, MUCl/Y and 
MUCl/V receptors can be developed to analyse the serum 
levels of these receptors. The serum concentrations of
these proteins, which, as previously described, are
expressed at high levels in breast cancer cells, may provide 
a means for diagnosing individuals with early breast cancer 
and/or for monitoring the progression of breast cancer in 
patients who have already been diagnosed. 

Based on the teachings of the present invention, these 
and other uses of the soluble receptors of the present 
invention will be clear to persons skilled in the art,
especially in the light of the description and use of
other soluble receptors in the literature [see, e.g.,
R. Fernandez-Botran, The FASEB Journal, Vol. 5,
pp. 2567-2574 (1991) and S. Chamow, Int. J. Cancer,
Supplement 7, pp. 69-72 (1992)].

Ligands

Receptor molecules, such as the MUCl/X, MUCl/Y and 
MUCl/V proteins, specifically bind ligands. The MUCl/Z 
protein is secreted from the cell [Figs. 3 and 4] and, as
detailed below, functions as a ligand for the MUCl/X, MUCl/Y 
and MUCl/V receptor proteins. The MUCl/W protein is 
believed to have a similar ligand function, based on its 
structure. This is also true for the /alt configurations of 
MUCl/Z and MUCl/W. 

By using antibodies generated in rabbits directed
against MUCl/Z, we have unequivocally shown that the MUCl/Z
protein is synthesized in breast tumor tissue, but not by
normal breast tissue, and that it migrates in
SDS-polyacrylamide gels with an apparent molecular mass of
approximately 25 kDa. Binding of the 25 kDa protein to
anti-MUCl/Z antibodies could be specifically competed out by
the addition of bacterial recombinant MUCl/Z protein,
thereby confirming the identity of the 25 kDa protein as the
MUCl/Z protein.

Investigation of the amino acid sequence of the MUCl/Z 
protein revealed several interesting features. 

First, as the MUCl/Z protein contains a signal 
sequence, but does not harbour a transmembrane domain, it is 
expected to be secreted from the cell. 

Second, an outstanding feature of the MUCl/Z protein is
the tryptophan-tryptophan (WW) sequence, localized just
proximal to the C-terminal part of the protein [amino acid
numbers 93 and 94 in the MUCl/Z sequence (Fig. 8A) and amino
acid numbers 102 and 103 in the MUCl/Z/alt sequence (Fig.
8B)]. This is unusual in that tryptophan is the least
frequently occurring amino acid in proteins. A computer
search for other proteins containing WW sequences revealed
that the cell surface receptor for calcitonin contains the
sequence GQRLWWYH, which is, strikingly, almost
identical to the MUCl/Z sequence GQDLWWYN [amino acid
numbers 89 to 96, Fig. 8A]. Such an occurrence of amino
acid identity would occur at a probability of less than 1 in
64 million. This suggests, therefore, that the MUCl/Z
protein is in some way involved with cell surface receptor
interactions.
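
The text does not say how the 1-in-64-million figure was obtained. One way to arrive at a number of that size is sketched below. It rests on assumptions that are not in the source: all 20 amino acids are treated as equally likely, positions are treated as independent, and only the identical positions are counted.

```python
calcitonin_receptor_motif = "GQRLWWYH"
muc1_z_motif = "GQDLWWYN"

# Position-by-position comparison of the two octapeptides.
matches = [a == b for a, b in zip(calcitonin_receptor_motif, muc1_z_motif)]
identical = sum(matches)

# Naive chance estimate: each identical position matches with probability 1/20.
one_in = 20 ** identical

print(identical, "of", len(matches), "positions identical")
print("naive chance of matching those positions: 1 in", f"{one_in:,}")
```

Under these assumptions six of the eight positions are identical and 20^6 is 64,000,000. Because tryptophan is rarer than the uniform model assumes, the true chance of two adjacent tryptophans matching would be lower still, which agrees with the "less than" in the text.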

Third, the MUCl/Z protein sequence contains several 
features that are found in other known ligands. For 
example, human epidermal growth factor (EGF) contains the 
sequence D L K W W and a similar sequence, D L W W appears 
in the MUCl/Z protein. Significantly, the location of this
sequence is identical in both proteins, occurring just
proximal to the carboxyl-terminus of the protein.

Fourth, a highly-conserved sequence, consisting of
CXCXXXXXG and which occurs in all growth factor
ligand members, appears in the MUCl/Z protein [amino acid
numbers 70 to 78, Fig. 8A].
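
The CXCXXXXXG pattern can be located in a protein sequence with a regular expression. The following is a minimal sketch; the helper name `find_motif` and the toy sequence are hypothetical, and X is assumed to mean "any residue".

```python
import re

# C-X-C-X(5)-G: cysteine, any residue, cysteine, any five residues, glycine.
MOTIF = re.compile(r"C.C.{5}G")


def find_motif(sequence: str) -> list[tuple[int, int, str]]:
    """Return (start, end, match) tuples with 1-based, inclusive residue numbers."""
    return [(m.start() + 1, m.end(), m.group()) for m in MOTIF.finditer(sequence)]


# Toy sequence for illustration only (not the MUCl/Z sequence of Fig. 8A).
toy_sequence = "AAAACKCDEFGHGAAA"
print(find_motif(toy_sequence))  # [(5, 13, 'CKCDEFGHG')]
```

A match always spans nine residues, which is consistent with the nine-residue window (amino acid numbers 70 to 78) cited in the text.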

Fifth, the MUCl/Z protein also contains several peptide
sequences which are found in members of the prolactin/growth
hormone family, such as prolactin, proliferin, and growth
hormone.

Taken together, the above considerations all support 
the present finding that the MUCl/Z protein acts as a ligand 
for the MUCl/Y receptor protein. 

The following experiments further support the above
contention. The extracellular domain of the MUCl/Y receptor
protein was synthesized as a recombinant bacterial protein,
purified, radioactively labelled, and then
used to probe Western blots containing proteins found in
breast tumor tissue lysates. The labelled MUCl/Y receptor
protein specifically bound to a 25 kDa protein that
comigrated with the MUCl/Z protein; this protein was present
in breast tumor tissue lysates, yet was absent in
normal breast tissue. Furthermore, in different cell
types and tissues, the levels of the MUCl/Z protein
directly correlated with the levels of the 25 kDa protein
that binds the MUCl/Y receptor protein.

The MUCl/Z protein is therefore the ligand of the 
MUCl/X, MUCl/Y and MUCl/V receptor proteins. This is true 
also for MUCl/Z/alt.

MUCl/W and MUCl/W/alt also contain a signal sequence 
and do not have a transmembrane domain. They are thus
secreted from the cell and, based on their structure,
function as ligands in a similar fashion to the MUCl/Z and 
MUCl/Z/alt proteins.

In the method of the present invention, the new MUCl 
proteins described and claimed herein can be administered in 
various ways. It should be noted that these new MUCl 
proteins can be administered alone, or in combination with
pharmaceutically acceptable carriers. Compositions
according to the present invention can be administered
orally or parenterally, including intravenous,
intraperitoneal, intranasal and subcutaneous administration. 
Implants of the compounds are also useful. The patient 
being treated is a warm-blooded animal, and in particular, 
mammals including man. 

The proteins of the present invention are administered 
in combination with other drugs, or singly, consistent with 
good medical practice. The composition is administered and
dosed in accordance with good medical practice, taking into
account the clinical condition of the individual patient,
the site and method of administration, scheduling of 
administration, and other factors known to medical 
practitioners. The "effective amount" for purposes herein 
is thus determined by such considerations as are known in 
the art. 

When administering the new MUCl proteins parenterally,
the pharmaceutical formulations suitable for injection 
include sterile aqueous solutions or dispersions and sterile 
powders for reconstitution into sterile injectable solutions 
or dispersions. The carrier can be a solvent or dispersing 
medium containing, for example, water, ethanol, polyol (for 
example, glycerol, propylene glycol, liquid polyethylene 
glycol, and the like), suitable mixtures thereof, and
vegetable oils. 

Proper fluidity can be maintained, for example, by the 
use of a coating such as lecithin, by the maintenance of the 
required particle size in the case of dispersion, and by the 
use of surfactants. Non-aqueous vehicles such as cottonseed 
oil, sesame oil, olive oil, soybean oil, corn oil, sunflower
oil, or peanut oil and esters, such as isopropyl myristate,
may also be used as solvent systems for compound
compositions. Additionally, various additives which enhance
the stability, sterility, and isotonicity of the
compositions, including antimicrobial preservatives,
antioxidants, chelating agents, and buffers, can be added.
Prevention of the action of microorganisms can be ensured by 
various antibacterial and antifungal agents, for example, 
parabens, chlorobutanol, phenol, sorbic acid, and the like. 
In many cases, it will be desirable to include isotonic 
agents, for example, sugars, sodium chloride, and the like. 
Prolonged absorption of the injectable pharmaceutical form 
can be brought about by the use of agents delaying 
absorption, for example, aluminum monostearate and gelatin. 
According to the present invention, however, any vehicle, 
diluent or additive used would have to be compatible with 
the compounds. 

Sterile injectable solutions can be prepared by
incorporating the proteins utilized in practicing the
present invention in the required amount of the appropriate
solvent with various of the other ingredients, as desired.

A pharmacological formulation of the new MUCl proteins 
described and claimed herein can be administered to the 
patient in an injectable formulation containing any
compatible carrier, such as various vehicles, adjuvants,
additives, and diluents; or the compounds utilized in the
present invention can be administered parenterally to the
patient in the form of slow-release subcutaneous implants or
targeted delivery systems, such as polymer matrices,
liposomes, and microspheres. An implant suitable for use in
the present invention can take the form of a pellet which 
slowly dissolves after being implanted, or a biocompatible 
delivery module well-known to those skilled in the art. 
Such well-known dosage forms and modules are designed such 
that the active ingredients are slowly released over a 
period of several days to several weeks. 

Examples of well-known implants and modules useful in 
the present invention include: U.S. Patent No. 4,487,603,
which discloses an implantable micro-infusion pump for
dispensing medication at a controlled rate; U.S. Patent
No. 4,486,194, which discloses a therapeutic device for
administering medicaments through the skin; U.S. Patent No.
4,447,233, which discloses a medication infusion pump for 
delivering medication at a precise infusion rate; U.S. 
Patent No. 4,447,224, which discloses a variable flow, 
implantable infusion apparatus for continuous drug delivery; 
U.S. Patent No. 4,439,196, which discloses an osmotic drug 
delivery system having multi-chamber compartments; and U.S. 
Patent No. 4,475,196, which discloses an osmotic drug 
delivery system. These patents are incorporated herein by 
reference. Many other such implants, delivery systems, and 
modules are well-known to those skilled in the art. 

A pharmacological formulation of the new MUCl proteins 
utilized in the present invention can be administered orally 
to the patient. Conventional methods such as administering 
the compounds in tablets, suspensions, solutions, emulsions,
capsules, powders, syrups and the like, are usable. Known
techniques which deliver the new MUCl proteins orally or 
intravenously and retain the biological activity, are 
preferred. 

In one embodiment, the new MUCl proteins can be 
administered initially by intravenous injection to bring 
blood levels of the new MUCl proteins to a suitable level.
The patient's MUCl protein levels are then maintained by an
oral dosage form, although other forms of administration,
dependent upon the patient's condition and as indicated
above, can be used. The quantity of the new MUCl proteins
to be administered will vary for the patient being treated,
and will vary from about 100 ng/kg of body weight to
100 mg/kg of body weight per day, and preferably will be
from 10 µg/kg to 10 mg/kg per day.
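
To make the stated per-kilogram ranges concrete, the unit arithmetic can be sketched as follows. This is purely an illustration of the conversion, not dosing guidance; the function name `daily_range_mg` and the 70 kg body weight are hypothetical choices, and the lower preferred bound is assumed to read 10 µg/kg.

```python
NG_PER_UG = 1_000
NG_PER_MG = 1_000_000

# Ranges stated in the text, expressed in ng per kg of body weight per day.
BROAD_RANGE_NG_PER_KG = (100, 100 * NG_PER_MG)
PREFERRED_RANGE_NG_PER_KG = (10 * NG_PER_UG, 10 * NG_PER_MG)


def daily_range_mg(body_weight_kg: float, range_ng_per_kg: tuple[int, int]) -> tuple[float, float]:
    """Convert a per-kg daily range (in ng/kg) to a total daily range in mg."""
    low, high = range_ng_per_kg
    return (low * body_weight_kg / NG_PER_MG, high * body_weight_kg / NG_PER_MG)


print(daily_range_mg(70, PREFERRED_RANGE_NG_PER_KG))  # (0.7, 700.0)
print(daily_range_mg(70, BROAD_RANGE_NG_PER_KG))      # (0.007, 7000.0)
```

Holding the ranges in a single base unit (ng) avoids mixing prefixes when the broad and preferred ranges are compared.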

EXAMPLE 1 

Immunoassays for Detecting and Quantitating the New MUCl
Proteins in Body Fluids 

To detect and quantitate the new MUCl proteins in body 
fluids such as, for example, serum, one of the most useful 
methods is the two-antibody sandwich assay [see E. Harlow
and D. Lane, ibid., Chapter 14, "Immunoassays," pp. 553-612
(1988)].

Both polyclonal and monoclonal antibodies are prepared 
against the new MUCl proteins. To use the two-antibody 
assay, one antibody is purified and bound to a solid phase, 
and one of the new MUCl proteins which is to be assayed is 
allowed to bind. Unbound proteins are removed by washing
and the labelled second antibody is allowed to bind to the
antigen. After washing, the assay is quantitated by
measuring the amount of labelled second antibody that is
bound to the matrix and a calibration curve is established 
for the specific new MUCl protein which was assayed. 

To assay for the presence of the new MUCl proteins in 
body fluids, the above assay is repeated, using as test 
antigen a sample of the body fluid. 
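
Reading an unknown sample off the calibration curve can be sketched as a linear interpolation between the two nearest standards. The function name `interpolate_concentration` and the standard values below are hypothetical; real sandwich-assay curves are often non-linear and may call for a different fitting model.

```python
from bisect import bisect_left


def interpolate_concentration(signal, standards):
    """Estimate concentration from a measured signal by linear interpolation.

    standards: list of (concentration, signal) pairs whose signal rises
    monotonically with concentration.
    """
    points = sorted(standards, key=lambda p: p[1])
    signals = [s for _, s in points]
    if not signals[0] <= signal <= signals[-1]:
        raise ValueError("signal lies outside the calibration range")
    i = bisect_left(signals, signal)
    if signals[i] == signal:
        return points[i][0]
    (c0, s0), (c1, s1) = points[i - 1], points[i]
    return c0 + (c1 - c0) * (signal - s0) / (s1 - s0)


# Hypothetical standards: (concentration, bound label signal), arbitrary units.
hypothetical_standards = [(0.0, 0.0), (10.0, 200.0), (20.0, 400.0), (40.0, 600.0)]
print(interpolate_concentration(500.0, hypothetical_standards))  # 30.0
```

Signals outside the range spanned by the standards are rejected rather than extrapolated, since the curve is only established for the concentrations actually assayed.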

EXAMPLE 2 

Immunohistochemical Staining for the Detection of the New
MUCl Proteins in Tissue Sections

Histological studies for the detection of the new MUCl 
proteins are carried out on paraformaldehyde-fixed,
paraffin-embedded tissue samples.

The cells or tissues are fixed to the glass slides and 
permeabilized using standard procedures as described in E.
Harlow and D. Lane, ibid., Chapter 10, "Cell Staining," pp.
359-420 (1988). The antibodies against one of the new MUCl 
proteins are then added to the fixed and permeabilized cells 
or tissues. As in many other immuno-chemical techniques, 
the antibodies can be labelled directly either with an 
enzyme, fluorochrome, etc., or detected by using a labelled
secondary reagent that binds specifically to the primary 
antibody. 

EXAMPLE 3

In-Vivo Imaging of Breast Cancer Cells with Labelled Ligands
that Bind to the New MUCl Receptor Proteins

The MUCl/Z, MUCl/Z/alt, MUCl/W and MUCl/W/alt ligand
proteins are used to target and thereby image breast cancer
cells in the living body. These ligand molecules are
radioactively labelled with, for example, radioactive iodine
using, for example, the Bolton-Hunter reagent
[labelled N-succinimidyl 3-(4-hydroxyphenylpropionate)].

A 0.5-1 mg/ml solution of the new MUCl ligand
proteins is prepared in 0.1 M sodium borate (pH 8.5) and
transferred to ice. Approximately 500 microcurie of Bolton-
Hunter reagent is transferred to a 1.5 ml conical tube at
0°C and the reagent is dried in a stream of dry nitrogen
gas. About 10 microliters of the protein solution is added 
to the dry Bolton-Hunter reagent, mixed gently and returned 
to the ice. Following incubation on ice for 15 minutes, a 
stop solution consisting of 100 microliters of 0.5 M 
ethanolamine, 10% glycerol, 0.1% xylene cyanol, 0.1 M sodium 
borate (pH 8.5) is added and incubated for 5 min. at room 
temperature. The radioactively iodinated MUCl/Z, 
MUCl/Z/alt, MUCl/W and MUCl/W/alt ligand proteins are then 
separated from the iodinated Bolton-Hunter reagent on a 
gel-filtration column.

To image breast cancer cells in vivo, the labelled 
ligand molecules are injected intravenously into the 
patient, and the distribution of the radioactively labelled 
molecules is monitored using radioactive imaging devices. 

EXAMPLE 4

Ligand as a Drug Delivery System for Ligand-Toxin Conjugates

The MUCl/Z, MUCl/Z/alt, MUCl/W and MUCl/W/alt ligand
proteins are conjugated to cytotoxic substances and thereby
used as drug delivery systems to target and kill breast
cancer cells within the body. Several cytotoxic substances
for conjugation may be used, including cytotoxic proteins
such as Pseudomonas exotoxin A and ricin [I. Pastan and D.
Fitzgerald, "Recombinant Toxins for Cancer Treatment,"
Science, Vol. 254, pp. 1173-1177 (1991)] or cytotoxic levels
of radioactivity.

Conjugation of the new MUCl proteins to cytotoxic 
proteins is performed by any of a number of coupling 
procedures, including glutaraldehyde coupling and periodate 
coupling. 

In the two-step glutaraldehyde method, glutaraldehyde 
is first coupled to the pure cytotoxic protein via the 
reactive amino groups available on the protein. The
cytotoxic protein-glutaraldehyde mix is then purified and
added to the MUCl/Z, MUCl/Z/alt, MUCl/W, and MUCl/W/alt
ligand proteins. Unconjugated material is then separated 
from the cytotoxic protein/new MUCl protein conjugate. 

The cytotoxic protein is dissolved in 0.2 ml of 1.25% 
glutaraldehyde (electron microscopic grade) in 100 mM sodium
phosphate (pH 6.8). After 18 hours at room temperature,
excess free glutaraldehyde is removed by gel filtration on a
gel matrix that is pre-equilibrated with 0.15 M NaCl. The
peak fractions containing the glutaraldehyde-linked
cytotoxic protein are concentrated by ultrafiltration or by 
dialysis against 100 mM sodium carbonate-sodium bicarbonate 
buffer (pH 9.5) containing 30% sucrose. The new MUCl ligand
proteins dissolved in 0.1 ml of 0.15 M NaCl are added to the
cytotoxic protein solution, the pH is kept above 9.0, and
the mixture is incubated at 4°C for 24 hours. At this
stage, 0.1 ml of 0.2 M ethanolamine (pH 7.0) is added and
the mixture incubated for a further 2 hours at 4°C.

The cytotoxic protein-new MUCl ligand conjugate is then 
separated from the unconjugated protein molecules by either 
gel filtration or gel electrophoresis. 

For periodate coupling, the new MUCl ligand proteins
are resuspended in 1.2 ml of water and freshly-prepared
0.1 M sodium periodate (0.3 ml) in 10 mM sodium phosphate
buffer (pH 7.0) is added. The mixture is incubated at room
temperature for 20 minutes and then dialysed against 1 mM
sodium acetate (pH 4.0) at 4°C with several changes
overnight. A 0.5 ml solution (10 mg/ml) of the cytotoxic 
protein (for example, ricin) is prepared in 20 mM sodium 
carbonate buffer (pH 9.5) and added to the solution of the 
periodate treated new MUCl ligand proteins. The mixture is 
incubated at room temperature for 2 hours. The Schiff's 
bases that have formed are then reduced by adding 100 
microliters of sodium borohydride (4 mg/ml) in water and 
incubating at 4°C for 2 hours.
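
The quantities implied by the periodate procedure follow from simple dilution arithmetic. The sketch below assumes that volumes are additive; the helper name `diluted_molarity` is hypothetical, and exact fractions are used to avoid floating-point rounding.

```python
from fractions import Fraction


def diluted_molarity(stock_molar: Fraction, stock_ml: Fraction, other_ml: Fraction) -> Fraction:
    """C1*V1 = C2*V2, assuming the two volumes are simply additive."""
    return stock_molar * stock_ml / (stock_ml + other_ml)


# 0.3 ml of 0.1 M sodium periodate added to 1.2 ml of protein suspension.
periodate_final = diluted_molarity(Fraction(1, 10), Fraction(3, 10), Fraction(12, 10))

toxin_mg = Fraction(1, 2) * 10            # 0.5 ml at 10 mg/ml
borohydride_mg = Fraction(100, 1000) * 4  # 100 microliters at 4 mg/ml

print(float(periodate_final), float(toxin_mg), float(borohydride_mg))  # 0.02 5.0 0.4
```

On these assumptions the periodate is present at 20 mM during the oxidation step, 5 mg of cytotoxic protein is added, and 0.4 mg of sodium borohydride is used for the reduction.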

The cytotoxic protein-new MUCl ligand conjugate is then 
separated from the unconjugated protein molecules by either
gel filtration or gel electrophoresis. 

Cytotoxic protein-new MUCl ligand conjugates may also
be prepared using recombinant DNA technology. In this
method, recombinant bacteria are generated that synthesize
fusion proteins consisting of the cytotoxic protein fused to
the new MUCl ligand proteins.

It will be evident to those skilled in the art that the 
invention is not limited to the details of the foregoing 
illustrative embodiments and examples, and that the present 
invention may be embodied in other specific forms without 
departing from the essential attributes thereof, and it is 
therefore desired that the present embodiments be considered 
in all respects as illustrative and not restrictive, 
reference being made to the appended claims, rather than to 
the foregoing description, and all changes which come within 
the meaning and range of equivalency of the claims are 
therefore intended to be embraced therein. 

WHAT IS CLAIMED IS:

1. A biochemically pure MUCl protein, selected from
the group consisting of MUCl/X, MUCl/X/alt, MUCl/Y,
MUCl/Y/alt, MUCl/V, MUCl/V/alt, MUCl/W, MUCl/W/alt, MUCl/Z
and MUCl/Z/alt or a functional derivative thereof devoid of 
a tandem repeat array. 

2. A MUCl protein according to claim 1 or a functional 
derivative thereof, comprising a partial amino acid
sequence 

MTPGTQSPFFLLLLLTVLT[ATTAPKPAT]
VVTGSGHASSTPGGEKETSATQRSSVP
SSTEKNA

and devoid of a tandem repeat array downstream thereof. 

3. A MUCl protein according to claim 1 or a functional
derivative thereof, comprising a partial amino acid 
sequence 

MTPGTQSPFFLLLLLTVLT [ATTAPKPAT] 
VVTGSGHASSTPGGEKETSATQRSSVPS 

and devoid of a tandem repeat array downstream thereof. 

4. Biochemically pure MUCl/X and MUCl/X/alt, respectively 
comprising the sequences shown in Figs. 5A and 5B, or 
functional derivatives thereof.

5. Biochemically pure MUCl/Y and MUCl/Y/alt, respectively
comprising the sequences shown in Figs. 6A and 6B, or
functional derivatives thereof.

6. Biochemically pure MUCl/V and MUCl/V/alt, respectively
comprising the sequences shown in Figs. 6C and 6D, or
functional derivatives thereof.

7. Biochemically pure MUCl/W and MUCl/W/alt, respectively 
comprising the sequences shown in Figs. 7A and 7B, or
functional derivatives thereof. 

8. Biochemically pure MUCl/Z and MUCl/Z/alt, respectively 
comprising the sequences shown in Figs. 8A and 8B, or 
functional derivatives thereof. 

9. A pharmaceutical composition, comprising as an active 
ingredient therein a biochemically pure MUCl protein 
selected from the group consisting of MUCl/X, MUCl/X/alt, 
MUCl/Y, MUCl/Y/alt, MUCl/V, MUCl/V/alt, MUCl/W, MUCl/W/alt, 
MUCl/Z, MUCl/Z/alt and functional derivatives thereof, in 
combination with a pharmaceutically acceptable carrier. 

10. A pharmaceutical composition for the treatment of
human breast cancer, comprising as an active ingredient
therein a biochemically pure MUCl protein selected from the
group consisting of MUCl/X, MUCl/X/alt, MUCl/Y, MUCl/Y/alt,
MUCl/V, MUCl/V/alt, and functional derivatives thereof, in
soluble form and in combination with a pharmaceutically 
acceptable carrier. 

11. A conjugated toxin for the treatment of human breast 
cancer, comprising a MUCl protein selected from the group 
consisting of MUCl/W, MUCl/W/alt, MUCl/Z, MUCl/Z/alt and
functional derivatives thereof, attached to a cytotoxic 
agent. 

12. A diagnostic agent for the detection of human breast
cancer cells, comprising a detectably labelled, 
biochemically pure MUCl protein selected from the group 
consisting of MUCl/W, MUCl/W/alt, MUCl/Z, MUCl/Z/alt and
functional derivatives thereof. 

13. A diagnostic agent for identification of sites in the 
body to which breast cancer cells have spread, comprising a 
detectably labelled MUCl protein selected from the group 
consisting of MUCl/W, MUCl/W/alt, MUCl/Z, MUCl/Z/alt and 
functional derivatives thereof.

14. A method for the treatment of human breast cancer,
comprising administering to an individual having human
breast cancer cells an amount of soluble MUCl/X, MUCl/X/alt,
MUCl/Y, MUCl/Y/alt, MUCl/V, or MUCl/V/alt receptors,
sufficient to inhibit the binding of MUCl ligands to said
cells.

15. A method for the treatment of human breast cancer, 
comprising administering to an individual having human
breast cancer cells an amount of a ligand-toxin conjugate
comprising a ligand selected from MUCl/W, MUCl/W/alt, MUCl/Z
or MUCl/Z/alt, fused to a cytotoxic toxin.

16. A DNA sequence encoding the protein MUCl/X, comprising 
the nucleotide sequence substantially as shown in Fig. 5A or
a functional derivative thereof devoid of a tandem repeat 
array. 

17. A DNA sequence encoding the protein MUCl/X/alt, 
comprising the nucleotide sequence substantially as shown in 
Fig. 5B or a functional derivative thereof devoid of a 
tandem repeat array. 

18. A DNA sequence encoding the protein MUCl/Y, comprising 
the nucleotide sequence substantially as shown in Fig. 6A or 
a functional derivative thereof devoid of a tandem repeat 
array. 

19. A DNA sequence encoding the protein MUCl/Y/alt,
comprising the nucleotide sequence substantially as shown in
Fig. 6B or a functional derivative thereof devoid of a
tandem repeat array.

20. A DNA sequence encoding the protein MUCl/V, comprising 
the nucleotide sequence substantially as shown in Fig. 6C or 
a functional derivative thereof devoid of a tandem repeat 
array.

21. A DNA sequence encoding the protein MUCl/V/alt, 
comprising the nucleotide sequence substantially as shown in
Fig. 6D or a functional derivative thereof devoid of a 
tandem repeat array. 

22. A DNA sequence encoding the protein MUCl/W, comprising
the nucleotide sequence substantially as shown in Fig. 7A or
a functional derivative thereof devoid of a tandem repeat
array.

23. A DNA sequence encoding the protein MUCl/W/alt,
comprising the nucleotide sequence substantially as shown in 
Fig. 7B or a functional derivative thereof devoid of a 
tandem repeat array. 

24. A DNA sequence encoding the protein MUCl/Z, comprising 
the nucleotide sequence substantially as shown in Fig. 8A or
a functional derivative thereof devoid of a tandem repeat
array.

25. A DNA sequence encoding the protein MUCl/Z/alt,
comprising the nucleotide sequence substantially as shown in
Fig. 8B or a functional derivative thereof devoid of a
tandem repeat array.

26. A DNA sequence according to any of claims 16-25, being
a cDNA. 

27. An in-vitro bioassay for determining the presence of 
breast cancer cells in a sample, comprising contacting a 
tissue sample with a diagnostic agent, said agent comprising 
a detectably labelled MUCl protein selected from the group
consisting of MUCl/W, MUCl/W/alt, MUCl/Z, MUCl/Z/alt and
functional derivatives thereof. 

28. An in-vitro bioassay for determining the presence of 
breast cancer cells in a sample, comprising:

a) isolating a specimen selected from the group 
consisting of tissue and cell biopsies, and 

b) assaying said specimen with antibodies selected 
from the group consisting of monoclonal and polyclonal 
antibodies that recognize a protein selected from the group 
consisting of MUCl/X, MUCl/X/alt, MUCl/Y, MUCl/Y/alt, 
MUCl/V, MUCl/V/alt, MUCl/W, MUCl/W/alt, MUCl/Z, MUCl/Z/alt
and functional derivatives thereof. 

29. A DNA construct selected from the group consisting of
cDNA coding for a biochemically pure MUCl protein selected
from the group consisting of MUCl/X, MUCl/X/alt, MUCl/Y,
MUCl/Y/alt, MUCl/V, MUCl/V/alt, MUCl/W, MUCl/W/alt, MUCl/Z
and MUCl/Z/alt or a functional derivative thereof devoid of 
a tandem repeat array. 

30. The construct of claim 29, which is contained in a 
vector.

31. A host cell transfected with the construct of claim 30. 

32. A bioassay for screening substances for the ability to 
inhibit mammary carcinoma, comprising: 

a) administering the substance to a cell transfectant 
that expresses a protein selected from the group consisting 
of MUCl/X, MUCl/X/alt, MUCl/Y, MUCl/Y/alt, MUCl/V, 
MUCl/V/alt, and functional derivatives thereof; and 

b) determining whether such substance inhibits the
growth of the cell transfectant.

33. A purified antibody which specifically binds a protein 
of claim 1. 

34. The antibody of claim 33, wherein said antibody is 
conjugated to a therapeutic drug. 

35. The antibody of claim 33, wherein said antibody is
conjugated to a detectable moiety. 

36. The antibody of claim 33, wherein said antibody is 
bound to a solid support.

37. A bioassay for determining the amount of a MUCl 
protein selected from the group consisting of MUCl/X, 
MUCl/X/alt, MUCl/Y, MUCl/Y/alt, MUCl/V, MUCl/V/alt, MUCl/W, 
MUCl/W/alt, MUCl/Z and MUCl/Z/alt or a functional derivative
thereof devoid of a tandem repeat array in a biological
sample, comprising: 

a) contacting said biological sample with an antibody 
under conditions such that a specific complex of said 
antibody and said MUCl protein can be formed; and

b) determining the amount of said antibody/MUCl 
protein complex, the amount of the complex indicating the 
amount of said MUCl protein in the biological sample. 

38. A method of detecting the presence of cancer in a 
subject, comprising determining the presence of a detectable 
amount of a MUC1 protein selected from the group consisting 
of MUC1/X, MUC1/X/alt, MUC1/Y, MUC1/Y/alt, MUC1/V, 
MUC1/V/alt, MUC1/W, MUC1/W/alt, MUC1/Z and MUC1/Z/alt or a 
functional derivative thereof devoid of a tandem repeat 
array in a biopsy from said subject, the presence of a 
detectable amount of said MUC1 protein relative to the 
absence of said MUC1 protein in a normal control indicating 
the presence of cancer. 

39. A method of determining the prognosis of a subject 
having cancer, comprising determining the presence of a 
detectable amount of a MUC1 protein selected from the group 
consisting of MUC1/X, MUC1/X/alt, MUC1/Y, MUC1/Y/alt, 
MUC1/V, MUC1/V/alt, MUC1/W, MUC1/W/alt, MUC1/Z and 
MUC1/Z/alt or a functional derivative thereof devoid of a 
tandem repeat array in a biopsy from said subject, the 
presence of a detectable amount of said MUC1 protein 
relative to the absence of said MUC1 protein in a normal 
control indicating a decreased chance of long-term 
survival. 

30 . . 60
ATGACACCGGGCACCCAGTCTCCTTTCTTCCTGCTGCTGCTCCTCACAGTGCTTACAGTT
MTPGTQSPFFLLLLLTVLTV
20

90 . . 120
GTTACAGGTTCTGGTCATGCAAGCTCTACCCCAGGTGGAGAAAAGGAGACTTCGGCTACC
VTGSGHASSTPGGEKETSAT
40

150 . . 180
CAGAGAAGTTCAGTGCCCAGCTCTACTGAGAAGAATGCTTTGTCTACTGGGGTCTCTTTC
QRSSVPSSTEKNALSTGVSF
60

210 . . 240
TTTTTCCTGTCTTTTCACATTTCAAACCTCCAGTTTAATTCCTCTCTGGAAGATCCCAGC
FFLSFHISNLQFNSSLEDPS
80

270 . . 300
ACCGACTACTACCAAGAGCTGCAGAGAGACATTTCTGAAATGTTTTTGCAGATTTATAAA
TDYYQELQRDISEMFLQIYK
100

330 . . 360
CAAGGGGGTTTTCTGGGCCTCTCCAATATTAAGTTCAGGCCAGGATCTGTGGTGGTACAA
QGGFLGLSNIKFRPGSVVVQ
120

390 . . 420
TTGACTCTGGCCTTCCGAGAAGGTACCATCAATGTCCACGACGTGGAGACACAGTTCAAT
LTLAFREGTINVHDVETQFN
140

450 . . 480
CAGTATAAAACGGAAGCAGCCTCTCGATATAACCTGACGATCTCAGACGTCAGCGTGAGT
QYKTEAASRYNLTISDVSVS
160

510 . . 540
GATGTGCCATTTCCTTTCTCTGCCCAGTCTGGGGCTGGGGTGCCAGGCTGGGGCATCGCG
DVPFPFSAQSGAGVPGWGIA
180

570 . . 600
CTGCTGGTGCTGGTCTGTGTTCTGGTTGCGCTGGCCATTGTCTATCTCATTGCCTTGGCT
LLVLVCVLVALAIVYLIALA
200

630 . . 660
GTCTGTCAGTGCCGCCGAAAGAACTACGGGCAGCTGGACATCTTTCCAGCCCGGGATACC
VCQCRRKNYGQLDIFPARDT
220

690 . . 720
TACCATCCTATGAGCGAGTACCCCACCTACCACACCCATGGGCGCTATGTGCCCCCTAGC
YHPMSEYPTYHTHGRYVPPS
240

750 . . 780
AGTACCGATCGTAGCCCCTATGAGAAGGTTTCTGCAGGTAATGGTGGCAGCAGCCTCTCT
STDRSPYEKVSAGNGGSSLS
260

810
TACACAAACCCAGCAGTGGCAGCCACTTCTGCCAACTTGTAG
YTNPAVAATSANL*

Fig-5A 

30 . . 60
ATGACACCGGGCACCCAGTCTCCTTTCTTCCTGCTGCTGCTCCTCACAGTGCTTACAGCT
MTPGTQSPFFLLLLLTVLTA
20

90 . . 120
ACCACAGCCCCTAAACCCGCAACAGTTGTTACAGGTTCTGGTCATGCAAGCTCTACCCCA
TTAPKPATVVTGSGHASSTP
40

150 . . 180
GGTGGAGAAAAGGAGACTTCGGCTACCCAGAGAAGTTCAGTGCCCAGCTCTACTGAGAAG
GGEKETSATQRSSVPSSTEK
60

210 . . 240
AATGCTTTGTCTACTGGGGTCTCTTTCTTTTTCCTGTCTTTTCACATTTCAAACCTCCAG
NALSTGVSFFFLSFHISNLQ
80

270 . . 300
TTTAATTCCTCTCTGGAAGATCCCAGCACCGACTACTACCAAGAGCTGCAGAGAGACATT
FNSSLEDPSTDYYQELQRDI
100

330 . . 360
TCTGAAATGTTTTTGCAGATTTATAAACAAGGGGGTTTTCTGGGCCTCTCCAATATTAAG
SEMFLQIYKQGGFLGLSNIK
120

390 . . 420
TTCAGGCCAGGATCTGTGGTGGTACAATTGACTCTGGCCTTCCGAGAAGGTACCATCAAT
FRPGSVVVQLTLAFREGTIN
140

450 . . 480
GTCCACGACGTGGAGACACAGTTCAATCAGTATAAAACGGAAGCAGCCTCTCGATATAAC
VHDVETQFNQYKTEAASRYN
160

510 . . 540
CTGACGATCTCAGACGTCAGCGTGAGTGATGTGCCATTTCCTTTCTCTGCCCAGTCTGGG
LTISDVSVSDVPFPFSAQSG
180

570 . . 600
GCTGGGGTGCCAGGCTGGGGCATCGCGCTGCTGGTGCTGGTCTGTGTTCTGGTTGCGCTG
AGVPGWGIALLVLVCVLVAL
200

630 . . 660
GCCATTGTCTATCTCATTGCCTTGGCTGTCTGTCAGTGCCGCCGAAAGAACTACGGGCAG
AIVYLIALAVCQCRRKNYGQ
220

690 . . 720
CTGGACATCTTTCCAGCCCGGGATACCTACCATCCTATGAGCGAGTACCCCACCTACCAC
LDIFPARDTYHPMSEYPTYH
240

750 . . 780
ACCCATGGGCGCTATGTGCCCCCTAGCAGTACCGATCGTAGCCCCTATGAGAAGGTTTCT
THGRYVPPSSTDRSPYEKVS
260

810 . . 840
GCAGGTAATGGTGGCAGCAGCCTCTCTTACACAAACCCAGCAGTGGCAGCCACTTCTGCC
AGNGGSSLSYTNPAVAATSA
280

AACTTGTAG
NL*

30 . . 60
ATGACACCGGGCACCCAGTCTCCTTTCTTCCTGCTGCTGCTCCTCACAGTGCTTACAGTT
MTPGTQSPFFLLLLLTVLTV
20

90 . . 120
GTTACAGGTTCTGGTCATGCAAGCTCTACCCCAGGTGGAGAAAAGGAGACTTCGGCTACC
VTGSGHASSTPGGEKETSAT
40

150 . . 180
CAGAGAAGTTCAGTGCCCAGCTCTACTGAGAAGAATGCTTTTAATTCCTCTCTGGAAGAT
QRSSVPSSTEKNAFNSSLED
60

210 . . 240
CCCAGCACCGACTACTACCAAGAGCTGCAGAGAGACATTTCTGAAATGTTTTTGCAGATT
PSTDYYQELQRDISEMFLQI
80

270 . . 300
TATAAACAAGGGGGTTTTCTGGGCCTCTCCAATATTAAGTTCAGGCCAGGATCTGTGGTG
YKQGGFLGLSNIKFRPGSVV
100

330 . . 360
GTACAATTGACTCTGGCCTTCCGAGAAGGTACCATCAATGTCCACGACGTGGAGACACAG
VQLTLAFREGTINVHDVETQ
120

390 . . 420
TTCAATCAGTATAAAACGGAAGCAGCCTCTCGATATAACCTGACGATCTCAGACGTCAGC
FNQYKTEAASRYNLTISDVS
140

450 . . 480
GTGAGTGATGTGCCATTTCCTTTCTCTGCCCAGTCTGGGGCTGGGGTGCCAGGCTGGGGC
VSDVPFPFSAQSGAGVPGWG
160

510 . . 540
ATCGCGCTGCTGGTGCTGGTCTGTGTTCTGGTTGCGCTGGCCATTGTCTATCTCATTGCC
IALLVLVCVLVALAIVYLIA
180

570 . . 600
TTGGCTGTCTGTCAGTGCCGCCGAAAGAACTACGGGCAGCTGGACATCTTTCCAGCCCGG
LAVCQCRRKNYGQLDIFPAR
200

630 . . 660
GATACCTACCATCCTATGAGCGAGTACCCCACCTACCACACCCATGGGCGCTATGTGCCC
DTYHPMSEYPTYHTHGRYVP
220

690 . . 720
CCTAGCAGTACCGATCGTAGCCCCTATGAGAAGGTTTCTGCAGGTAATGGTGGCAGCAGC
PSSTDRSPYEKVSAGNGGSS
240

750 . 780
CTCTCTTACACAAACCCAGCAGTGGCAGCCACTTCTGCCAACTTGTAG
LSYTNPAVAATSANL*

30 . . 60
ATGACACCGGGCACCCAGTCTCCTTTCTTCCTGCTGCTGCTCCTCACAGTGCTTACAGTT
MTPGTQSPFFLLLLLTVLTV
20

90 . . 120
ACCACAGCCCCTAAACCCGCAACAGTTGTTACAGGTTCTGGTCATGCAAGCTCTACCCCA
TTAPKPATVVTGSGHASSTP
40

150 . . 180
GGTGGAGAAAAGGAGACTTCGGCTACCCAGAGAAGTTCAGTGCCCAGCTCTACTGAGAAG
GGEKETSATQRSSVPSSTEK
60

210 . . 240
AATGCTTTTAATTCCTCTCTGGAAGATCCCAGCACCGACTACTACCAAGAGCTGCAGAGA
NAFNSSLEDPSTDYYQELQR
80

270 . . 300
GACATTTCTGAAATGTTTTTGCAGATTTATAAACAAGGGGGTTTTCTGGGCCTCTCCAAT
DISEMFLQIYKQGGFLGLSN
100

330 . . 360
ATTAAGTTCAGGCCAGGATCTGTGGTGGTACAATTGACTCTGGCCTTCCGAGAAGGTACC
IKFRPGSVVVQLTLAFREGT
120

390 . . 420
ATCAATGTCCACGACGTGGAGACACAGTTCAATCAGTATAAAACGGAAGCAGCCTCTCGA
INVHDVETQFNQYKTEAASR
140

450 . . 480
TATAACCTGACGATCTCAGACGTCAGCGTGAGTGATGTGCCATTTCCTTTCTCTGCCCAG
YNLTISDVSVSDVPFPFSAQ
160

510 . . 540
TCTGGGGCTGGGGTGCCAGGCTGGGGCATCGCGCTGCTGGTGCTGGTCTGTGTTCTGGTT
SGAGVPGWGIALLVLVCVLV
180

570 . . 600
GCGCTGGCCATTGTCTATCTCATTGCCTTGGCTGTCTGTCAGTGCCGCCGAAAGAACTAC
ALAIVYLIALAVCQCRRKNY
200

630 . . 660
GGGCAGCTGGACATCTTTCCAGCCCGGGATACCTACCATCCTATGAGCGAGTACCCCACC
GQLDIFPARDTYHPMSEYPT
220

690 . . 720
TACCACACCCATGGGCGCTATGTGCCCCCTAGCAGTACCGATCGTAGCCCCTATGAGAAG
YHTHGRYVPPSSTDRSPYEK
240

750 . . 780
GTTTCTGCAGGTAATGGTGGCAGCAGCCTCTCTTACACAAACCCAGCAGTGGCAGCCACT
VSAGNGGSSLSYTNPAVAAT
260

TCTGCCAACTTGTAG
SANL*

Fig-6B 

30 . . 60
ATGACACCGGGCACCCAGTCTCCTTTCTTCCTGCTGCTGCTCCTCACAGTGCTTACAGTT
MTPGTQSPFFLLLLLTVLTV
20

90 . . 120
GTTACAGGTTCTGGTCATGCAAGCTCTACCCCAGGTGGAGAAAAGGAGACTTCGGCTACC
VTGSGHASSTPGGEKETSAT
40

150 . . 180
CAGAGAAGTTCAGTGCCCAGCACCGACTACTACCAAGAGCTGCAGAGAGACATTTCTGAA
QRSSVPSTDYYQELQRDISE
60

210 . . 240
ATGTTTTTGCAGATTTATAAACAAGGGGGTTTTCTGGGCCTCTCCAATATTAAGTTCAGG
MFLQIYKQGGFLGLSNIKFR
80

270 . . 300
CCAGGATCTGTGGTGGTACAATTGACTCTGGCCTTCCGAGAAGGTACCATCAATGTCCAC
PGSVVVQLTLAFREGTINVH
100

330 . . 360
GACGTGGAGACACAGTTCAATCAGTATAAAACGGAAGCAGCCTCTCGATATAACCTGACG
DVETQFNQYKTEAASRYNLT
120

390 . . 420
ATCTCAGACGTCAGCGTGAGTGATGTGCCATTTCCTTTCTCTGCCCAGTCTGGGGCTGGG
ISDVSVSDVPFPFSAQSGAG
140

450 . . 480
GTGCCAGGCTGGGGCATCGCGCTGCTGGTGCTGGTCTGTGTTCTGGTTGCGCTGGCCATT
VPGWGIALLVLVCVLVALAI
160

510 . . 540
GTCTATCTCATTGCCTTGGCTGTCTGTCAGTGCCGCCGAAAGAACTACGGGCAGCTGGAC
VYLIALAVCQCRRKNYGQLD
180

570 . . 600
ATCTTTCCAGCCCGGGATACCTACCATCCTATGAGCGAGTACCCCACCTACCACACCCAT
IFPARDTYHPMSEYPTYHTH
200

630 . . 660
GGGCGCTATGTGCCCCCTAGCAGTACCGATCGTAGCCCCTATGAGAAGGTTTCTGCAGGT
GRYVPPSSTDRSPYEKVSAG
220

690 . . 720
AATGGTGGCAGCAGCCTCTCTTACACAAACCCAGCAGTGGCAGCCACTTCTGCCAACTTG
NGGSSLSYTNPAVAATSANL
240

TAG
*

30 . . 60
ATGACACCGGGCACCCAGTCTCCTTTCTTCCTGCTGCTGCTCCTCACAGTGCTTACAGCT
MTPGTQSPFFLLLLLTVLTA
20

90 . . 120
ACCACAGCCCCTAAACCCGCAACAGTTGTTACAGGTTCTGGTCATGCAAGCTCTACCCCA
TTAPKPATVVTGSGHASSTP
40

150 . . 180
GGTGGAGAAAAGGAGACTTCGGCTACCCAGAGAAGTTCAGTGCCCAGCACCGACTACTAC
GGEKETSATQRSSVPSTDYY
60

210 . . 240
CAAGAGCTGCAGAGAGACATTTCTGAAATGTTTTTGCAGATTTATAAACAAGGGGGTTTT
QELQRDISEMFLQIYKQGGF
80

270 . . 300
CTGGGCCTCTCCAATATTAAGTTCAGGCCAGGATCTGTGGTGGTACAATTGACTCTGGCC
LGLSNIKFRPGSVVVQLTLA
100

330 . . 360
TTCCGAGAAGGTACCATCAATGTCCACGACGTGGAGACACAGTTCAATCAGTATAAAACG
FREGTINVHDVETQFNQYKT
120

390 . . 420
GAAGCAGCCTCTCGATATAACCTGACGATCTCAGACGTCAGCGTGAGTGATGTGCCATTT
EAASRYNLTISDVSVSDVPF
140

450 . . 480
CCTTTCTCTGCCCAGTCTGGGGCTGGGGTGCCAGGCTGGGGCATCGCGCTGCTGGTGCTG
PFSAQSGAGVPGWGIALLVL
160

510 . . 540
GTCTGTGTTCTGGTTGCGCTGGCCATTGTCTATCTCATTGCCTTGGCTGTCTGTCAGTGC
VCVLVALAIVYLIALAVCQC
180

570 . . 600
CGCCGAAAGAACTACGGGCAGCTGGACATCTTTCCAGCCCGGGATACCTACCATCCTATG
RRKNYGQLDIFPARDTYHPM
200

630 . . 660
AGCGAGTACCCCACCTACCACACCCATGGGCGCTATGTGCCCCCTAGCAGTACCGATCGT
SEYPTYHTHGRYVPPSSTDR
220

690 . . 720
AGCCCCTATGAGAAGGTTTCTGCAGGTAATGGTGGCAGCAGCCTCTCTTACACAAACCCA
SPYEKVSAGNGGSSLSYTNP
240

750
GCAGTGGCAGCCACTTCTGCCAACTTGTAG
AVAATSANL*

30 . . 60
ATGACACCGGGCACCCAGTCTCCTTTCTTCCTGCTGCTGCTCCTCACAGTGCTTACAGTT
MTPGTQSPFFLLLLLTVLTV
20

90 . . 120
GTTACAGGTTCTGGTCATGCAAGCTCTACCCCAGGTGGAGAAAAGGAGACTTCGGCTACC
VTGSGHASSTPGGEKETSAT
40

150 . . 180
CAGAGAAGTTCAGTGCCCAGCTCTACTGAGAAGAATGCTCACTTCTCCCCAGTTGTCTAC
QRSSVPSSTEKNAHFSPVVY
60

210
TGGGGTCTCTTTCTTTTTCCTGTCTTTTCACATTTCAAACCTCCAGTTTAA
WGLFLFPVFSHFKPPV*

Fig-7A 



30 . . 60
ATGACACCGGGCACCCAGTCTCCTTTCTTCCTGCTGCTGCTCCTCACAGTGCTTACAGCT
MTPGTQSPFFLLLLLTVLTA
20

90 . . 120
ACCACAGCCCCTAAACCCGCAACAGTTGTTACAGGTTCTGGTCATGCAAGCTCTACCCCA
TTAPKPATVVTGSGHASSTP
40

150 . . 180
GGTGGAGAAAAGGAGACTTCGGCTACCCAGAGAAGTTCAGTGCCCAGCTCTACTGAGAAG
GGEKETSATQRSSVPSSTEK
60

210 . . 240
AATGCTCACTTCTCCCCAGTTGTCTACTGGGGTCTCTTTCTTTTTCCTGTCTTTTCACAT
NAHFSPVVYWGLFLFPVFSH
80

TTCAAACCTCCAGTTTAA
FKPPV*

30 . . 60
ATGACACCGGGCACCCAGTCTCCTTTCTTCCTGCTGCTGCTCCTCACAGTGCTTACAGTT
MTPGTQSPFFLLLLLTVLTV
20

90 . . 120
GTTACAGGTTCTGGTCATGCAAGCTCTACCCCAGGTGGAGAAAAGGAGACTTCGGCTACC
VTGSGHASSTPGGEKETSAT
40

150 . . 180
CAGAGAAGTTCAGTGCCCAGCTCTACTGAGAAGAATGCTATCCCAGCACCGACTACTACC
QRSSVPSSTEKNAIPAPTTT
60

210 . . 240
AAGAGCTGCAGAGAGACATTTCTGAAATGTTTTTGCAGATTTATAAACAAGGGGGTTTTC
KSCRETFLKCFCRFINKGVF
80

270
TGGGCCTCTCCAATATTAAGTTCAGGCCAGGATCTGTGGTGGTACAATTGA
WASPILSSGQDLWWYN*

Fig-8A 



30 . . 60
ATGACACCGGGCACCCAGTCTCCTTTCTTCCTGCTGCTGCTCCTCACAGTGCTTACAGCT
MTPGTQSPFFLLLLLTVLTA
20

90 . . 120
ACCACAGCCCCTAAACCCGCAACAGTTGTTACAGGTTCTGGTCATGCAAGCTCTACCCCA
TTAPKPATVVTGSGHASSTP
40

150 . . 180
GGTGGAGAAAAGGAGACTTCGGCTACCCAGAGAAGTTCAGTGCCCAGCTCTACTGAGAAG
GGEKETSATQRSSVPSSTEK
60

210 . . 240
AATGCTATCCCAGCACCGACTACTACCAAGAGCTGCAGAGAGACATTTCTGAAATGTTTT
NAIPAPTTTKSCRETFLKCF
80

270 . . 300
TGCAGATTTATAAACAAGGGGGTTTTCTGGGCCTCTCCAATATTAAGTTCAGGCCAGGAT
CRFINKGVFWASPILSSGQD
100

CTGTGGTGGTACAATTGA
LWWYN*

RECOGNITION SPECIFICITIES OF SH2 DOMAINS 